Tag: fastq

How do I get separate ADT / CITE-seq fastq’s from single SRA / BAM files? (originally generated from cellranger)

Hello all. I am trying to pre-process some single cell RNA and ADT (TotalSeq-C) data from a GEO SRA, but having some issues getting separate fastq’s for the “CITE-seq” (ADT)…

HISAT2 and HTSEQ command

Hello, For analysis of the data by RNA sequencing I selected HISAT2 and HTSeq-count for mapping and counting the gene levels; the libraryLayout is paired. I am using the below command for both but the results are not exact…

Job – Principal Biostatistician/Bioinformatics job at Kenya Medical Research

Vacancy title: Principal Biostatistician/Bioinformatics [Type: FULL TIME, Industry: Research, Category: Research] Jobs at: Kenya Medical Research – KEMRI. Deadline of this Job: 06 October 2022. Duty Station: Within Kenya, Kisumu, East Africa. Summary Date Posted: Tuesday, September 20, 2022, Base Salary: Not Disclosed…

Comment: How to resolve a ValueError: Multiple 'HD' Lines are not permitted when I run Ci

I tried your suggestion **samtools view -H qname_unknown_circle.bam** and the output result is like this: yu@root:~$ samtools view -H qname_unknown_circle.bam @HD VN:1.5 SO:queryname @SQ SN:chr1 LN:248956422 @SQ SN:chr10 LN:133797422 @SQ SN:chr11 LN:135086622 ………(Many lines like this ‘@SQ SN:chrxx LN:xxxx’ are omitted) @SQ SN:chrY_KI270740v1_random LN:37240 @HD VN:1.5 SO:unsorted GO:query @PG ID:bwa…

Run fastqc reports for each of the fastq files in:

Run fastqc reports for each of the fastq files in: /data/CompRes/seq_platform_data/ To run the fastqc reports, you can write a script that will execute the following command (edited for you) for each file: fastqc /data/CompRes/seq_platform_data/<inputfastq> -o /home/ASURITE/BIO439/module4/ Then transfer the html files to your local computer and view them, so…
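
The "script that will execute the command for each file" could look roughly like the sketch below. It is only an illustration: the two directories are the ones quoted in the excerpt, while the list of FASTQ suffixes and the helper name build_fastqc_commands are assumptions made for this example. The sketch prints the commands rather than running them, so they can be checked first.

```python
import os
import shlex
from pathlib import PurePosixPath

# Directories quoted in the excerpt above.
INPUT_DIR = PurePosixPath("/data/CompRes/seq_platform_data")
OUTPUT_DIR = PurePosixPath("/home/ASURITE/BIO439/module4")

# Assumed suffixes; the excerpt does not say how the files are named.
FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def build_fastqc_commands(filenames, input_dir=INPUT_DIR, output_dir=OUTPUT_DIR):
    """Return one 'fastqc <input> -o <output_dir>' argument list per FASTQ file."""
    commands = []
    for name in sorted(filenames):
        # Skip anything that does not look like a FASTQ file.
        if name.endswith(FASTQ_SUFFIXES):
            commands.append(
                ["fastqc", str(input_dir / name), "-o", str(output_dir)]
            )
    return commands


if __name__ == "__main__":
    # List the input directory if it exists on this machine.
    names = os.listdir(str(INPUT_DIR)) if os.path.isdir(str(INPUT_DIR)) else []
    for cmd in build_fastqc_commands(names):
        # Print each command for review; pass cmd to subprocess.run to execute it.
        print(shlex.join(cmd))
```

FastQC expects the output directory given with -o to exist already, so it may need to be created before the commands are run.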

CNV Pipeline Options

The following are the top-level options that are shared with the DRAGEN Host Software to control the CNV pipeline. You can input a BAM or CRAM file into the CNV pipeline. If you are using the DRAGEN mapper and aligner, you can use FASTQ files. …

Bioinformatics Scientist in Pittsburgh, PA

Description Purpose:The scientist works independently using a robust math toolbox to discover solutions for a diverse portfolio of interesting and challenging problems. The scientist develops, implements, and monitors advanced analytic, medical informatics, and predictive modeling tools for health care programs at the UPMC. The scientist normally works Monday through Friday…

DRAGEN .bcl conversion error due to improper custom p5 oligos

I am working on a custom library preparation method and I designed my p5 oligos incorrectly. To be specific, I used the reverse complement of the correct p5 index sequence. As a result my fastq files aren’t demultiplexing properly….

Understanding bam tracks

Sorry, I am having trouble understanding this concept. For example, when I view a bam file after alignment in IGV, I see that there are different tracks that form. How are these tracks formed/why do some aligned sequences belong together or are part of the same…

Evolution of stickleback spines through independent cis-regulatory changes at HOXDB

Darwin, C. On the Origin of Species by Means of Natural Selection (John Murray, 1859). Owen, R. On the Archetype and Homologies of the Vertebrate Skeleton (Richard and John E. Taylor, 1848). Stern, D. L. & Orgogozo, V. Is genetic evolution predictable? Science 323, 746–751 (2009). …

What is the Difference Between FASTA and FASTQ

The key difference between FASTA and FASTQ is that FASTA is a text-based format that only stores nucleotide or protein sequences, while FASTQ is a text-based format that stores both sequence and associated sequence quality values. Bioinformatics is a field that uses different software to analyse and understand biological data,…
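
To make the difference concrete, here is a minimal sketch that reads both formats from plain text. It assumes simple four-line FASTQ records and Phred+33 quality encoding, neither of which is stated in the excerpt, and the function names read_fasta and read_fastq are made up for this illustration.

```python
def read_fasta(text):
    """Parse FASTA text into {header: sequence}; a sequence may span several lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A '>' line starts a new record.
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def read_fastq(text, offset=33):
    """Parse four-line FASTQ records into (header, sequence, qualities) tuples.

    Assumes each record is exactly four lines and qualities use the given
    ASCII offset (33 by default).
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ text does not contain complete four-line records")
    records = []
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        # Each quality character encodes one Phred score per base.
        records.append((header[1:], seq, [ord(c) - offset for c in qual]))
    return records


fasta_records = read_fasta(">seq1\nACGT\nTTGA\n")
fastq_records = read_fastq("@read1\nACGT\n+\nII5!\n")
print(fasta_records)
print(fastq_records)
```

The FASTA result holds only a name and a sequence, while each FASTQ record also carries one quality score per base.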

iPSCs derived from infertile men carrying complex genetic abnormalities can generate primordial germ-like cells

Patients and controls: Patient 1 was 38 years old and consulted for infertility after he and his partner had been trying to conceive for 2 years. The patient was the first child of unrelated parents, and he had four brothers and five sisters whose fertility status could not be determined…

mapping – STAR error in snakemake pipeline: “EXITING because of FATAL ERROR: could not open genome file”

I’m trying to use a 2 pass STAR mapping strategy (also explained here informatics.fas.harvard.edu/rsem-example-on-odyssey.html), but I’m getting an error. I’ve read through this page [https://github.com/alexdobin/STAR/issues/181] and I have a similar issue, but the discussed solutions don’t seem to help. Perhaps this is more a snakemake issue rather than a STAR…

Patrick Murphy Bulk RNA-Seq – HackMD

Patrick Murphy bulk RNA-Seq Analysis. 1. Introduction: This is a bulk RNA-Seq project, which includes human data….

TPM normalization starting with read counts

Hello everyone, I have multiple bulk RNA-seq datasets that I need to apply the same pipeline to. I want to normalize them from counts data to TPM. In all datasets, I have the genes as rows, and samples as columns. Unfortunately, I don’t have the fastq files, all I…

Setting up Aspera Connect (ascp) on Linux and macOS

This tiny tutorial covers setting up Aspera Connect (the binary is called ascp), which might be used to download sequencing data, e.g. with download links provided by sra-explorer.info; see also sra-explorer : find SRA and FastQ download URLs in a couple of clicks. Setting up Aspera Connect is simple and was…

ViReMa not working after adapter trimming

Hi, I have a viral RNA-seq dataset that I am trying to run through ViReMa to look for deletion junctions. When I input my fastq files directly into ViReMa without trimming adapters first, my results look about how I would expect (multiple junctions…

Targeted inhibition of ubiquitin signaling reverses metabolic reprogramming and suppresses glioblastoma growth

Cell culture Human glioblastoma cells (U87MG and U87MG-Luc) and human embryonic kidney cells (HEK293) were obtained from the American Type Culture Collection (Manassas, Va.). Cells were cultured in Dulbecco’s Modified Eagle Medium supplemented with 10% fetal bovine serum (Gibco™ Fetal Bovine Serum South America, Thermo Scientific Fisher-US), 2 mM l-glutamine, 50 U/ml…

Genomic architecture of adaptive radiation and hybridization in Alpine whitefish

Sampling the radiation To understand the phylogenetic relationships between Alpine whitefish, we carried out whole-genome resequencing on 96 previously collected whitefish (with associated phenotypic measurements including standard length and gill-raker counts; collected in accordance with permits issued by the cantons of Zurich (ZH128/15), Bern (BE68/15), and Lucerne (LU04/14); these fish…

I have a query regarding differential gene expression using limma-voom.

I used the following pipeline for RNA-Seq analysis: Fastq - Trimmomatic - Hisat2 (gtf file was annotated) - featureCounts. After featureCounts I tried to do limma-voom, but I get an error saying this: An error occurred with this…

Could not locate a HISAT2 index to basename

First time trying out HISAT2 and I’m having a problem here, even with the pre-made indices for GRCh38. $ hisat2 -x /share/projects/RNASeq/data/reference/GRCh38/grch38_tran -1 /home/echang/PANCANCER-030817-JE3-35880845/KTP-10-43736695/KTP-10_S3_L001_R1_001.fastq.gz -2 /home/echang/PANCANCER-030817-JE3-35880845/KTP-10-43736695/KTP-10_S3_L001_R2_001.fastq.gz -S tmp.sam Error follows: Could not locate a HISAT2 index corresponding to basename “/share/projects/RNASeq/data/reference/GRCh38/grch38_tran” Error:…

Samtools Htslib Issues

Issue Title | State | Comments | Created Date | Updated Date
How to get a specific chromosome | open | 1 | 2022-07-14 | 2022-07-18
tabix returns row from VCF file multiple times | open | 4 | 2022-07-11 | 2022-07-18
Modified base parsing failure failure | closed | 0 | 2022-07-01 | 2022-07-18
extract genotype information | open | 1 | 2022-06-24 | 2022-07-18
sam_hdr_remove_lines is inefficient if…

Error in Importing data in qiime2

Hello All, I am trying to provide my amplicon sequencing files, which are in .fq format, as input for qiime2 and I am getting this error: There was a problem importing fastqcpwo: Missing one or more files for CasavaOneEightSingleLanePerSampleDirFmt: ‘.+_.+_L[0-9][0-9][0-9]_R[12]_001\\.fastq\\.gz’ Can someone please…

BWA alignment/Samtools; Fail to read the header

Hello, I have an issue with my alignment. This is an error in my log file: fail to read the header from "-". Here is my script: bwa mem -t 8 -R "@RG\tID:$2\tSM:$3" ~/scratch/pt6/pt6.fa ${1}_1.fastq.gz ${1}_2.fastq.gz 2>log.bwa_new.$1 |samtools view -S -h -b…

PeerJ expertRxiv – Postdoctoral Associate in Bioinformatics

Job description Location: Boca Raton, Florida Job Description: The College of Medicine of Florida Atlantic University, the 5th public university in Florida, is seeking a Bioinformatics Postdoctoral Associate with experience in bioinformatics pipeline development and genomics data analysis for a Bioinformatics and Computational Genomics laboratory which focuses on high-throughput…

Hisat2 – stringtie – deseq2 pipeline for bulk RNA seq

Software official website : Hisat2: Manual | HISAT2 StringTie:StringTie article :Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown | Nature Protocols It is recommended to watch the nanny level tutorial : 1. RNA-seq : Hisat2+Stringtie+DESeq2 – Hengnuo Xinzhi 2. RNA-seq use hisat2、stringtie、DESeq2 analysis – Simple books Basic usage…

Differing number of reads for read 1 and 2 in fastq’s from a subset bam file

I’m working with paired-end wgs data I downloaded from TCGA. I’m trying to extract the reads that align to a specific region, extract only those reads to two fastq files, one for each pair. Unfortunately, I am getting a different number of reads in both fastq files because some of…

Genetic characterization of two G8P[8] rotavirus strains isolated in Guangzhou, China, in 2020/21: evidence of genome reassortment | BMC Infectious Diseases

Mokomane M, Kasvosve I, Melo Ed, Pernica JM, Goldfarb DM. The global problem of childhood diarrhoeal diseases: emerging strategies in prevention and management. Ther Adv Infect Dis. 2018;5(1):29–43. Organization WH. Rotavirus vaccines: WHO position paper–July 2021. Weekly Epidemiol Rec. 2021;96(28):301–219. Bucardo F, Reyes Y, Svensson…

Pooled shRNA Library Screening to Identify Factors that Modulate a Drug Resistance Phenotype

High-throughput RNA interference (RNAi) screening using a pool of lentiviral shRNAs can be a tool to detect therapeutically relevant synthetic lethal targets in malignancies. We provide a pooled shRNA screening approach to investigate the epigenetic effectors in acute myeloid leukemia (AML). The overall goal of the following video is to…

How To Download Geo Data? Update New

Bioinformatics 101 | How to download RNA-Seq data from NCBI GEO | Bioinformatics for beginners…

bgzf_read_block] EOF marker is absent reformat.sh

BBMap/BBTools reformat.sh : real error or spurious message? [W::bgzf_read_block] EOF marker is absent reformat.sh 1 When subsampling paired-end .fastq.gz files using reformat.sh from BBMap/BBTools, I get this error message: [W::bgzf_read_block] EOF marker is absent reformat.sh I’ve checked the input files with gunzip -t, no error. The input files are a…

Allelic expression imbalance of PIK3CA mutations is frequent in breast cancer and prognostically significant

Subjects Normal breast and tumor samples were obtained with the written informed consent from donors and appropriate approval from local ethical committees, with the detailed information described in the respective original publications: normal tissue9, METABRIC14, TCGA35. Differential allelic expression analysis DNA and total RNA from 64 samples of normal breast…

Detailed differences between sambamba and samtools

3 month , My first post in the new student group , The false-positive mutation appears because duplicates mark Not enough ?, Tells the story of supplementary read It won’t be GATK MarkDuplicates Marked as duplicates The problem of . after , In response to this question , I began…

Detection of candidate gene LsACOS5 and development of InDel marker for male sterility by ddRAD-seq and resequencing analysis in lettuce

Ryder, E. J. Lettuce, Endive and Chicory (CABI Publishing, 1999). Seki, K. et al. A CIN-like TCP transcription factor (LsTCP4) having retrotransposon insertion associates with a shift from Salinas type to Empire type in crisphead lettuce (Lactuca sativa L.). Hortic. Res. 7, 1–14 (2020). Odland,…

[W::bgzf_read_block] EOF marker is absent in BBMAP

Hello, I’m asking about an issue encountered in bbmap. I was using bbmap to remove host contaminants from my microbiome data. The commands are as simple as below (ref folder already generated in the last step): bbmap.sh -Xmx42g in=R1.fastq.gz in2=R2.fastq.gz outu=cleaned.interleaved.fastq.gz threads=12 overwrite=t unpigz=t…

Strange Per base sequence content of fastqc

Hi, all! I downloaded fastq.gz files of GSE162708 from ENA, which only has 2 files for each sample (usually scRNA-seq has 3 files: I1, R1 & R2). Then I ran fastp as follows. Then I got the QC report, but I can’t understand why Per base sequence content of…

tReasure: R-based GUI package analyzing tRNA expression profiles from small RNA sequencing data | BMC Bioinformatics

tReasure (tRNA Expression Analysis Software Utilizing R for Easy use) is a graphical user interface (GUI) tool for the analysis of tRNA expression profiles from deep-sequencing data of small RNAs (small RNA-seq) using R packages. The whole analysis workflow, including the uploading of FASTQ files of small RNA-seq, quantification of…

FastQ_7 April 2022(1) – Copy.pptx – What is the FASTA format? The FASTA format is the “workhorse” of bioinformatics. It is used to represent sequence

The FASTA format is not “officially” defined – even though it carries the majority of data information on living systems. Its origins go back to a software tool called Fasta written by David Lipman (a scientist that later became, and still is, the director of NCBI) and William R. Pearson of the University of Virginia. The tool itself has (to some…

Reference-based alignment using MUSKET

I’m running MUSKET on my dataset trimmed_data.tar.gz using 1000 threads, 2000 threads, and 4000 threads on an HPC. I’ve been unable to obtain any results because the software seems to be running for a long time. ./../musket-1.1/musket -k 90 600000000 -p 1000 -zlib 9 -ino…

(ERR): bowtie2-align exited with value 13

I am trying to run bowtie2, but the following error occurs every time: bowtie2 --very-fast-local -x bowtie -q -1 R1.fastq -2 R2.fastq -s aligned.sam Saw ASCII character 10 but expected 33-based Phred qual. terminate called after throwing an instance of ‘int’ Aborted…

Postdoc / Research Scientist in Bioinformatics and Computational Genomics

Job Description Are you a computer geek with a strong interest in genomics? Do you want to use your computational skills to solve human diseases? At the Department of Neurology at Harvard Medical School and Brigham & Women’s Hospital, we have two vacant positions: postdoctoral fellow and research scientist in…

Qiime2 Exclude Seqs with FASTQ as query data.

Hello, I am working with FASTQ files and I want to filter them based on the alignment with reference sequences in FASTA format. I decided to use QIIME2 for this. So I imported both FASTA and FASTQ files to the required…

FastQC per base sequence content

I’m running FastQC on some paired-end fastq files. I have a warning on per-base sequence content, as the first 5 to 6 bases show significant bias towards T and G, as shown below. I was wondering what the sequence in the first 5 or…

Validate RNAseq salmon quantification pipeline

Hi, I’ve written a pipeline to perform quantification from RNAseq data with salmon. I’m trying to find a way to evaluate the quality of my results. I was thinking of running the pipeline on an available public dataset and comparing my output with another analysis….

can`t find a path for to file

I simply need to run Trimmomatic, but it doesn’t see the input files. Maybe you know how to deal with it? #creating variables INPUT_DIR="path/folderinput" OUTPUT_DIR="path/folderoutput" APPENDIX=".fastq.gz" APPENDIX1="_R1.fastq.gz" APPENDIX2="_R2.fastq.gz" TRIMMOMATIC="java -jar /home/path/trimmomatic-0.36.jar" #creating a loop for i in $INPUT_DIR/*$APPENDIX1 do FORWARD=$(basename…

Mapping back 3 sets of reads/sample with minimap2

I used FaQC to QC my raw fastqs before assembling. That program (and perhaps others) outputs properly paired forward and reverse fastqs, as well as an unpaired fastq file for each sample. I used all 3 for each single-sample assembly. Since minimap2 only allows for 2 query files,…

Sam file is not written

Dear all, It writes the following in the log file: [08-02 01:26:25] Running Step 2: BWA … bwa_wrap /work/pathology/s206442/dbet_project/hg19/hg19.fa Output3/out_1.valid.fastq 6 Output3/out_1.valid.sam 0 Running BWA on trimmed reads … bwa mem -t 6 /work/pathology/s206442/dbet_project/hg19/hg19.fa Output3/out_1.valid.fastq | samtools view -h -F 2048 - > Output3/out_1.valid.sam However, the sam file size is…

Mapped reference id is not an id of the genome file genome_nowhitespace.fa

Hi everyone, I’m trying to run the nf-co.re/smrnaseq pipeline and I’m having a problem with mirdeep2. Command: nextflow run nf-core/smrnaseq -profile ijcluster --input /home/794_both.fastq.gz --outdir /home/results --genome GRCh38 --protocol qiaseq --mature mirbase.org/ftp/CURRENT/mature.fa.gz --hairpin mirbase.org/ftp/CURRENT/hairpin.fa.gz Error message: Command…

Separate exogenous from endogenous transcripts using Salmon RNAseq DTU

Dear friends, We are trying to use Salmon for DTU analysis. We want to separate exogenous from endogenous transcripts by following this post www.biostars.org/p/443701/ and this paper f1000research.com/articles/7-952 We are focusing on a gene called ASCL1 (endo-ASCL1). We transduced cells with lentiviral vector containing ASCL1 ORF only (Lenti-ASCL1). There should…

Phylogenomic analysis of Syngnathidae reveals novel relationships, origins of endemic diversity and variable diversification rates | BMC Biology

Stölting KN, Wilson AB. Male pregnancy in seahorses and pipefish: beyond the mammalian model. Bioessays. 2007;29:884–96. Whittington CM, Friesen CR. The evolution and physiology of male pregnancy in syngnathid fishes. Biol Rev Camb Philos Soc. 2020;95:1252–72. Rosenqvist G, Berglund A. Sexual signals and mating…

BioInformatics Product Manager at Helix (remote)

You + Helix Helix is a place where innovators and doers gather in order to drive significant progress in population genomics. We have come together to work at the intersection of clinical care, research, and genomics.   If you’re excited by the idea of making a meaningful impact and joining a…

Trimmomatic/ linux system

Hi all, I am trying to remove adapters and clean my RNA-seq .gz files using Trimmomatic, loaded on a Linux system (supercomputer server), following the steps for paired-end reads explained in the manual (www.usadellab.org/cms/?page=trimmomatic): java -jar trimmomatic-0.39.jar PE input_forward.fq.gz input_reverse.fq.gz output_forward_paired.fq.gz output_forward_unpaired.fq.gz output_reverse_paired.fq.gz output_reverse_unpaired.fq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10:2:True LEADING:3…

Sequence Duplication Levels failed FastQC Report

Hi all, I’m checking quality for my RNA-Seq through FastQC and all my fastq failed on “Per base sequence content” and “Sequence Duplication Levels”, besides warning on “Overrepresented sequences” only for read 1 files (it’s paired-end; the sequences match between samples). Below is…

Why did I achieve shorter than initial reads subset after aligned reads extraction.

Hello dear colleagues! I have recently faced a problem. I have worked with long WGS reads. First I filtered the longest subset of reads, and aligned them to the custom sequence with several structural variants…

How to check Fasta file ASCII characters and fix encoding errors?

I tried building a diamond database but got this error. Error: Error reading input stream at line 180825: Invalid character (ASCII 0) in sequence How can I fix it? Is there a tool that checks for this and…

Mitogenome of a stink worm (Annelida: Travisiidae) includes degenerate group II intron that is also found in five congeneric species

Tan, M. H. et al. Comparative mitogenomics of the Decapoda reveals evolutionary heterogeneity in architecture and composition. Sci. Rep. 9, 1–16 (2019). Zhang, Y. et al. Phylogeny, evolution and mitochondrial gene order rearrangement in scale worms (Aphroditiformia, Annelida). Mol. Phylogenet. Evol. 125, 220–231 (2018). …

Feature count is very low using htseq-count

Hello all, I performed bbmap on my RNA-seq paired sequence data using the following cmd: bbmap.sh in1=J2_R1.fastq in2=J2_R2.fastq out=output_J2.sam ref=im4.fasta nodisk The header of the generated sam file is @HD VN:1.4 SO:unsorted @SQ SN:k141_1006 LN:2503 @SQ SN:k141_5512 LN:5393 @SQ SN:k141_4772 LN:4387 @SQ SN:k141_3267 LN:4531…

Minimap2 options for Nanopore cDNA direct seq

Hello, I’m working with ONT RNA seq data and I used the cDNA direct seq to do the seq. I want to look for long deletions in mRNAs that are not spliced; for this, I want to use the splice option of…

Fastp file merge append | Develop Paper

Interpretation of fastq file format: www.jianshu.com/p/39115d21ee17 Sometimes, the sequencing results of a species will return two double ended fastps.r1.fq.gz l1.fq.gz r2.fq.gz l2.fq.gz The content of sequencing data is actually one piece, but it is divided into two parts during transmission. When we use it, we are used to merging it into a double ended…

BTG2 gene predicts poor outcome in PT-DLBCL

Introduction Primary testicular diffuse large B-cell lymphoma (PT-DLBCL) is a rare and aggressive form of mature B-cell lymphoma.1–3 PT-DLBCL was the most common type of testicular tumor in men aged over 60 and characterized by painless uni- or bilateral testicular masses with infrequent constitutional symptoms.4–6 PT-DLBCL shows significant extranodal tropism,…

High-Throughput Transcriptome Analysis for Investigating Host-Pathogen Interactions

The protocol presented here describes a complete pipeline to analyze RNA-sequencing transcriptome data from raw reads to functional analysis, including quality control and preprocessing steps to advanced statistical analytical approaches. Welcome to the protocol of high-throughput transcriptome analysis for investigating host-pathogen interactions. This protocol is divided in the following steps….

BBTools – BioGrids Consortium – Supported Software

BBTools Description: a suite of fast, multithreaded bioinformatics tools designed for analysis of DNA and RNA sequence data. BBTools can handle common sequencing file formats such as fastq, fasta, sam, scarf, fasta+qual, compressed or raw, with autodetection of quality encoding and interleaving. Installation: Use the following command to…

sorting – indexing sorted alignment file with samtools index gives “Exec format error”

I am struggling with samtools index. I already did the alignment using “bwa mem reference.fa seq.fastq > alg.sam”. The resulting sam file was converted to bam format using “samtools view -S -h -b alg.sam > alg.bam”. Next, the files were sorted by using “sort -h alg.bam >sorted.bam”. And now we…

METASnake: a Snakemake workflow to facilitate…

Introduction As sequencing technology has become cheaper and more readily accessible, the need for the increased computational capacity to process these data has become apparent. In particular, high-throughput sequencing has been particularly useful when applied to the field of metagenomics. Substantial effort has been devoted to developing software and computational…

bedtools sample with fastq input and fewer input records than requested

I’m using bedtools sample to sample reads from fastq files. I’d like to submit two feature requests: If the number of requested records is larger than the input I get ERROR: Input file has fewer records than the requested number of output records. I guess this is intentional and not…

Extracellular circulating miRNAs as stress-related signature to search and rescue dogs

Study approval was provided by the Research Ethics Committee of the University of Perugia (report n.2018-21 of 11/12/2018) according to Italian Ministry of Health legislation18. All methods were carried out following relevant guidelines and regulations and the study was carried out in compliance with the ARRIVE guidelines. Informed consent is…

Continue Reading Extracellular circulating miRNAs as stress-related signature to search and rescue dogs

Per base sequence quality – fastqc

Hi everyone, I am new to bioinformatics and have a very basic question. I have paired-end fastq data and ran FastQC on it. In the per base sequence quality plot, a few reads are in the red region, and there is no adapter and…

Continue Reading Per base sequence quality – fastqc

Genomic variation from an extinct species is retained in the extant radiation following speciation reversal

Vamosi, J. C., Magallon, S., Mayrose, I., Otto, S. P. & Sauquet, H. Macroevolutionary patterns of flowering plant speciation and extinction. Annu. Rev. Plant Biol. 69, 685–706 (2018). Rhymer, J. M. & Simberloff, D. Extinction by hybridization and introgression. Annu. Rev. Ecol. Syst. 27, 83–109 (1996)…

Continue Reading Genomic variation from an extinct species is retained in the extant radiation following speciation reversal

Analyzing and slicing FASTQ file entries using Python

I have the code pasted below for running on FASTQ file entries in order to compare specific parts and remove redundant copies of the same sequence (based on the miRNA + umi_seq combination). I save the entry IDs and then make…

Continue Reading Analyzing and slicing FASTQ file entries using Python
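
The excerpt above describes removing redundant reads based on a miRNA + UMI combination. A minimal sketch of that idea follows, assuming a made-up read layout in which the first MIRNA_LEN bases are the miRNA and the last UMI_LEN bases are the UMI. These constants, the helper names and the toy records are hypothetical and are not taken from the original post.

```python
import io

MIRNA_LEN = 6  # assumption for illustration only
UMI_LEN = 4    # assumption for illustration only


def read_fastq(handle):
    """Yield (read_id, sequence, quality) from a 4-line-per-record FASTQ handle."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return
        seq = handle.readline().rstrip()
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual


def deduplicate(records, mirna_len=MIRNA_LEN, umi_len=UMI_LEN):
    """Keep the first record seen for each (miRNA, UMI) pair."""
    seen = set()
    for read_id, seq, qual in records:
        key = (seq[:mirna_len], seq[-umi_len:])
        if key not in seen:
            seen.add(key)
            yield read_id, seq, qual


# Toy data: r2 shares miRNA and UMI with r1; r3 has a different UMI.
FASTQ_TEXT = (
    "@r1\nACGTACGGTTTT\n+\nIIIIIIIIIIII\n"
    "@r2\nACGTACCCTTTT\n+\nIIIIIIIIIIII\n"
    "@r3\nACGTACGGAAAA\n+\nIIIIIIIIIIII\n"
)

kept_ids = [rid for rid, _, _ in deduplicate(read_fastq(io.StringIO(FASTQ_TEXT)))]
```

Keeping only a set of keys, rather than whole records, keeps memory use proportional to the number of distinct miRNA + UMI pairs.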

nf-core/circrna

circRNA quantification, differential expression analysis and miRNA target prediction of RNA-Seq data Introduction nf-core/circrna is a best-practice analysis pipeline for the quantification, miRNA target prediction and differential expression analysis of circular RNAs in paired-end RNA sequencing data. The pipeline is built using Nextflow, a workflow tool to run tasks across…

Continue Reading nf-core/circrna

Using AnnoTree to Get More Assignments, Faster, in DIAMOND+MEGAN Microbiome Analysis

INTRODUCTION Next-generation sequencing (NGS) has revolutionized many areas of biological research (1, 2), providing ever-more data at an ever-decreasing cost. One such area is microbiome research, the study of microbes in their theater of activity using metagenomic sequencing (3). Here, deep short-read sequencing, and improving performance of long-read sequencing, are…

Continue Reading Using AnnoTree to Get More Assignments, Faster, in DIAMOND+MEGAN Microbiome Analysis

Kallisto mapping paired end

Hello everyone, I am new to bioinformatics and I am trying to use kallisto to map paired-end data. However, I got an error when running the command. Does anyone know what I did wrong here? Thank you! Here is my command: kallisto…

Continue Reading Kallisto mapping paired end

FastQC for paired end data

Hi, I have 36 fastq files of paired-end RNA-seq, so I was wondering if anyone knows how to run FastQC on paired-end data, and how it differs from FastQC on single-end data? I have done this with single-end data before but…

Continue Reading FastQC for paired end data

Processing two lists of files with snakemake

I want to use snakemake to do bowtie2 mapping of split read files to a reference genome, and I’d like that rule to be integrated in the general workflow. For that purpose, I first defined a rule to create a bowtie index: rule build_bowtie_index: input: referenceGenomeFasta output: expand("{name}.{index}.bt2", index=range(1,5), name…

Continue Reading Processing two lists of files with snakemake

RNA-Seq Data Analysis Software – Isogen Lifescience

BlueBee Genomics The BlueBee platform is a production-ready, robust infrastructure that is easy to use for any researcher. It can be used for analysing data from QuantSeq, CORALL, and SLAMseq experiments. There is no prior bioinformatic experience required. Each purchased QuantSeq and CORALL kit includes a code for free data…

Continue Reading RNA-Seq Data Analysis Software – Isogen Lifescience

Find Transposon Element insertions using long reads (nanopore), by alignment directly. (minimap2)

find_te_ins is designed to find Transposon Element (TE) insertions using long reads (nanopore), by alignment directly. (minimap2) Install $ git clone github.com/bakerwm/find_te_ins.git $ cd find_te_ins Change the following variables to suit your setup: genome_fa and te_fa in line-10 and line-11; $ bash run_pipe.sh run_pipe.sh Prerequisite minimap2 – 2.17-r974-dirty, align long…

Continue Reading Find Transposon Element insertions using long reads (nanopore), by alignment directly. (minimap2)

Cell Strain-Derived Induced Pluripotent Stem Cells as an Isogenic Approach To Investigate Age-Related Host Response to Flaviviral Infection

INTRODUCTION Dengue is the most common mosquito-borne viral disease globally (1). This acute disease, which can be life-threatening, is caused by four different dengue viruses (DENVs) (DENV-1, DENV-2, DENV-3, and DENV-4). An estimated 390 million people are infected with these DENVs annually (2), and populations throughout the tropics face frequent…

Continue Reading Cell Strain-Derived Induced Pluripotent Stem Cells as an Isogenic Approach To Investigate Age-Related Host Response to Flaviviral Infection

Error in Rsubread featureCounts

Hi there, Excellent package! I am using it to do RNA-seq. But I encountered a small problem when using featureCounts(). The code is as follows: featureCounts( "A1.raw_1.fastq.gz.subjunc.BAM", annot.inbuilt = NULL, annot.ext = "GCF_015227675.2_mRatBN7.2_genomic.gtf", isGTFAnnotationFile=TRUE, isPairedEnd=TRUE, nthreads = 8 ) And it returns this: ========== _____ _ _ ____ _____ ______…

Continue Reading Error in Rsubread featureCounts

Postdoctoral position in bioinformatics – focused on single-cell immune transcriptomics – Karolinska Institute – job portal

Postdoctoral position in bioinformatics – focused on single-cell immune transcriptomics Login and apply Do you want to contribute to improving human health? We are looking for an ambitious postdoctoral fellow with solid genome-wide bioinformatics and computational biology skills to join our highly accomplished team. We offer a stimulating environment in…

Continue Reading Postdoctoral position in bioinformatics – focused on single-cell immune transcriptomics – Karolinska Institute – job portal

Merging compressed fastq files based on a conditions defined in a csv file

Hello everybody, I have a somewhat different question about a similar topic addressed in another post (link no longer available). I tried Paul’s bash script from that post (fastq_lane_merging.sh), adapting it to my filename organization: #!/bin/bash for i in $(find ./ -type f -name "*.fastq.gz" | while read F; do basename…

Continue Reading Merging compressed fastq files based on a conditions defined in a csv file
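
Merging gzipped FASTQ files according to a CSV file can also be sketched in Python. The example below relies on the fact that concatenated gzip members form a valid gzip stream, so the files can be appended byte for byte without decompressing. The CSV layout (columns named sample and filename), the function name merge_by_csv and the toy files are assumptions made for illustration, not details from the original post.

```python
import csv
import gzip
import shutil
import tempfile
from collections import defaultdict
from pathlib import Path


def merge_by_csv(csv_path, in_dir, out_dir):
    """Concatenate gzipped FASTQ files per sample, in the order listed in the CSV."""
    groups = defaultdict(list)
    with open(csv_path, newline="") as fh:
        for row in csv.DictReader(fh):
            groups[row["sample"]].append(Path(in_dir) / row["filename"])

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    merged = {}
    for sample, files in groups.items():
        target = out_dir / f"{sample}.fastq.gz"
        with open(target, "wb") as out:
            for path in files:
                with open(path, "rb") as src:
                    shutil.copyfileobj(src, out)  # raw byte append, no recompression
        merged[sample] = target
    return merged


# Toy demonstration in a temporary directory.
work = Path(tempfile.mkdtemp())
lanes = {
    "A_L001.fastq.gz": "@a1\nACGT\n+\nIIII\n",
    "A_L002.fastq.gz": "@a2\nTTGA\n+\nIIII\n",
    "B_L001.fastq.gz": "@b1\nGGCC\n+\nIIII\n",
}
for name, text in lanes.items():
    with gzip.open(work / name, "wt") as fh:
        fh.write(text)
with open(work / "samples.csv", "w", newline="") as fh:
    fh.write("sample,filename\nA,A_L001.fastq.gz\nA,A_L002.fastq.gz\nB,B_L001.fastq.gz\n")

merged = merge_by_csv(work / "samples.csv", work, work / "merged")
with gzip.open(merged["A"], "rt") as fh:
    merged_a = fh.read()
```

Driving the grouping from the CSV rather than from filename patterns means the merge condition lives in one editable table instead of in the script.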

Mapping to multiple references using bbmap

So my question comes in two parts: First of all is what I’m trying to do within reason given the tools I am using? I am investigating the shuffling effects of a recombinase on a known reporter sequence which subsequently generates libraries of unique sequences. By simulating all of the…

Continue Reading Mapping to multiple references using bbmap

bwa , 2 files fastq to 1 sam

I have this problem, please help me. I’m trying it from macOS Catalina. I am creating a SAM file from 2 fastq files using bwa. I apply the following command: bwa mem -t 2 GRCh38.primary_assembly.genome.fa.gz V350019555_L03_B5GHUMqcnrRAABA-556_1.fq.gz V350019555_L03_B5GHUMqcnrRAABA-556_2.fq.gz > V350019555_L03_B5GHUMqcnrRAABA-556.sam…

Continue Reading bwa , 2 files fastq to 1 sam

SeqIO object get cleared away after being accessed

I’m using Biopython to parse a fastq file, and I found that the SeqIO object gets cleared away once I have accessed it. from Bio import SeqIO record_fastqIO = SeqIO.parse('SRR835775_1.first1000.fastq','fastq') for record in record_fastqIO: print(record.id) This script works perfectly. But if I add one line to the script: from Bio import…

Continue Reading SeqIO object get cleared away after being accessed
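
The behaviour described in this question is typical of single-pass iterators in Python: SeqIO.parse returns an iterator, so a second loop over the same object yields nothing. The sketch below illustrates the idea without requiring Biopython. parse_fastq_ids is a hypothetical stand-in generator, not Biopython's API, and the two records are made up.

```python
import io


def parse_fastq_ids(handle):
    """Hypothetical stand-in for a FASTQ parser: yield the ID of each 4-line record."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return
        handle.readline()  # sequence line
        handle.readline()  # '+' separator line
        handle.readline()  # quality line
        yield header[1:].split()[0]


FASTQ_TEXT = "@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n"

records = parse_fastq_ids(io.StringIO(FASTQ_TEXT))
first_pass = [rid for rid in records]   # consumes the iterator
second_pass = [rid for rid in records]  # nothing left to yield

# If the records are needed more than once, materialise them a single time.
kept = list(parse_fastq_ids(io.StringIO(FASTQ_TEXT)))
```

Materialising into a list trades memory for reusability. For large files, creating a fresh parser for each pass is the lighter option.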

identify and remove adapter sequence

Hi all, I am trying to identify the adapter sequences in my ATAC-seq data. To do this, I ran the fastq file through FastQC, hoping the sequence would be picked up and shown in the report. In the report, there…

Continue Reading identify and remove adapter sequence

Petabase-scale sequence alignment catalyses viral discovery

Serratus alignment architecture Serratus (v0.3.0) (github.com/ababaian/serratus) is an open-source cloud-infrastructure designed for ultra-high-throughput sequence alignment against a query sequence or pangenome (Extended Data Fig. 1). Serratus compute costs are dependent on search parameters (expanded discussion available: github.com/ababaian/serratus/wiki/pangenome_design). The nucleotide vertebrate viral pangenome search (bowtie2, database size: 79.8 MB) reached processing rates…

Continue Reading Petabase-scale sequence alignment catalyses viral discovery

Samtools flagstat confusing result of a merged bam file

Hi, I am a bioinformatics student and I am struggling with an issue. I had paired-end fastq files for one sample with some low-quality bases at the end and adapter contamination, so I trimmed my reads with Trimmomatic. It gave me 4 files that I used for…

Continue Reading Samtools flagstat confusing result of a merged bam file

R and sra toolkit – odd system() behavior ( R, System )

Problem : ( Scroll to solution ) In order to extract some fastq data from NCBI’s sequence read archive I’ve downloaded and installed the sra toolkit for Windows. In order to test if it is setup correctly, I opened cmd, navigated to the directory and typed in the command fasterq-dump…

Continue Reading R and sra toolkit – odd system() behavior ( R, System )

The role of ATXR6 expression in modulating genome stability and transposable element repression in Arabidopsis

Significance The plant-specific H3K27me1 methyltransferases ATXR5 and ATXR6 play integral roles connecting epigenetic silencing with genomic stability. However, how H3K27me1 relates to these processes is poorly understood. In this study, we performed a comprehensive transcriptome analysis of tissue- and ploidy-specific expression in a hypomorphic atxr5/6 mutant and revealed that the…

Continue Reading The role of ATXR6 expression in modulating genome stability and transposable element repression in Arabidopsis

Any alternatives to BBMap’s clumpify.sh program to optimize gzip compression?

I’ve had some difficulties implementing this in pipelines because it randomly fails sometimes. Are there any other programs that can be used in its stead?…

Continue Reading Any alternatives to BBMap’s clumpify.sh program to optimize gzip compression?

ChaoXianSen/TrimGalore – Giters

Trim Galore is a wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data. Installation Trim Galore is a Perl wrapper around two tools: Cutadapt and FastQC. To use, ensure that these two pieces of software are available…

Continue Reading ChaoXianSen/TrimGalore – Giters

Mle Application With Gekko In Python

The true power of the state space model is to allow the creation and estimation of custom models. This notebook shows various statespace models that subclass sm. That means your MAGeCK python module is installed in /home/john/.pyenv/versions/2.7.13/lib/python2.7/site-packages. I use conda to install the latest version of. This two-volume set Diseases and Pathology…

Continue Reading Mle Application With Gekko In Python

[lh3/minimap2] Memory leak when using Python and threads

The program align.py uses mappy to align reads in Python using multiple worker threads. After loading the index the memory usage jumps up quickly to >20 GB and then continues to climb steadily through 40 GB and beyond. This issue was first discovered in bonito and isolated to mappy. The data flow…

Continue Reading [lh3/minimap2] Memory leak when using Python and threads

Bwa on multiple processor

Hi guys, when I try to run bwa mem on multiple processors, I get the following error: > mpirun -np 16 bwa mem hg19-agilent.fasta R1.fastq R2.fastq | samtools sort -o aln.bam [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::bwa_idx_load_from_disk] read…

Continue Reading Bwa on multiple processor

python – Missing input files after defining them in function

I am trying to do QC on RNAseq data that is tarballed. I am using Snakemake as a workflow manager and am aware that Snakemake does not like one-to-many rules. I thought defining a checkpoint would fix the problem, but when I run the script I get this error message…

Continue Reading python – Missing input files after defining them in function

Aligning multiple single and paired-end reads from multiple files (lanes)

Hello, I am new to bioinformatics and looking for some help. I have 27 files from an Illumina output. There are 4 paired-end and 23 single-read files. I am trying to align them using Rsubread in…

Continue Reading Aligning multiple single and paired-end reads from multiple files (lanes)

RedChIP identifies noncoding RNAs associated with genomic sites occupied by Polycomb and CTCF proteins

Abstract Nuclear noncoding RNAs (ncRNAs) are key regulators of gene expression and chromatin organization. The progress in studying nuclear ncRNAs depends on the ability to identify the genome-wide spectrum of contacts of ncRNAs with chromatin. To address this question, a panel of RNA–DNA proximity ligation techniques has been developed. However,…

Continue Reading RedChIP identifies noncoding RNAs associated with genomic sites occupied by Polycomb and CTCF proteins

tranfering sam file easy and fast way

Hi everyone, I tried to align my fastq files with hisat2 but could not, because my computer has 4 GB of RAM and the process was killed. So I ran the process on my friend’s computer, but now I should solve…

Continue Reading tranfering sam file easy and fast way

Alignment report

Hi guys, I aligned R1 and R2 fastq files to a reference genome using bwa mem and got a BAM file. Now I want to check whether the alignment was done correctly, along with the alignment percentage, coverage, etc. I ran the following command: bwa mem hg19.fasta R1.fastq R2.fastq | samtools…

Continue Reading Alignment report

how to align paired and unpaired fastq files of a sample using STAR?

Hi all, I’m new to using the STAR aligner. I have PE sequencing fastq files with paired forward and reverse reads and unpaired forward and reverse reads (4 files). In the manual of this tool, it seems…

Continue Reading how to align paired and unpaired fastq files of a sample using STAR?

sequence alignment – Help with MinION sequencing data species identification

Hi I’m new to bioinformatics and have just completed my first run on the MinION (long read sequencing Oxford Nanopore Technologies). I was hoping someone could direct me towards R packages, workflow, tutorials or guides that will help me identify species that are present in my sample mainly for fungi…

Continue Reading sequence alignment – Help with MinION sequencing data species identification