Tag: MAPQ

Filtering mitochondrial reads from ATAC-Seq aligned reads- what to do with reads that have MT in RNEXT field

Hi all, I am trying to filter mitochondrial reads from my ATAC-seq data after trimming with Trimmomatic and then aligning with Bowtie2. After searching through many pipelines, I have found 2 ways that people often do this (both using inputs that are sorted and indexed BAM files): 1) with samtools…

Continue Reading Filtering mitochondrial reads from ATAC-Seq aligned reads- what to do with reads that have MT in RNEXT field

Reads with highest MAPQ values from SAM files are showing mismatches to reference sequence and IGV classified them as supplementary reads

Hi all, I am expressing a GFP synonymous variant library in human cells and sequencing its RNA on the nanopore and I am having some trouble analysing the data. Initially, I basecalled all the fast5 files using the super accuracy model in the guppy basecaller, then I discarded the reads…

Continue Reading Reads with highest MAPQ values from SAM files are showing mismatches to reference sequence and IGV classified them as supplementary reads

minimap’s SAM file MAPQ value for the unique alignments

minimap’s SAM file MAPQ value for the unique alignments 0 Hi, I have mapped my reads from nanopore RNAseq to the reference database using the minimap. Minimap SAM files have MAPQ values from 0 to 60 in the fifth column. While 0 indicates the reads that can possibly map to…

Continue Reading minimap’s SAM file MAPQ value for the unique alignments

Upstream pseudogene causing MAPQ 0 and exclusion during variant calling

Upstream pseudogene causing MAPQ 0 and exclusion during variant calling 0 Hello, just upstream of GBA1 is a pseudogene that is quite similar to GBA1, this makes most of my mapped reads have MAPQ 0 across this region (I think? I use bwa-mem2 with default settings), and the variants are…

Continue Reading Upstream pseudogene causing MAPQ 0 and exclusion during variant calling

MAPQ filtering for clinical applications

A discussion recently arose about how one ought to filter MAPQ in a clinical setting, i.e., where a NGS sample is being processed in order to produce a result for a patient who has an unknown or hypothesised diagnosis. The result could obviously be key. It was suggested by a…

Continue Reading MAPQ filtering for clinical applications

High number of duplicates and low percentage properly paired

High number of duplicates and low percentage properly paired 0 I have some paired end sequencing data that I have trimmed using cutadapt. It was sequenced on an illumina novaseq 6000 and is low coverage RADseq data (2-3x). My cutadapt script used forward and reverse adapters from illumina : cutadapt…

Continue Reading High number of duplicates and low percentage properly paired

Genecount-difference between HT-seq count, RSEM, and Kallisto

Genecount-difference between HT-seq count, RSEM, and Kallisto 0 Hi I ran three genecount software tools (ht-seq, RSEM, Kallisto) to calculate genecount of RNA-seq data. For Ht-seq, i used STAR aligned Transcriptomesortedcordinate.bam file and defautl MAPQ score with intersection_nonempty mode. For RSEM, i used STAR aligner (used .gtf for building reference)…

Continue Reading Genecount-difference between HT-seq count, RSEM, and Kallisto

10x Cellranger SN RNA-seq BAM file output looks off

I’m working with SN RNA-seq data that I processed using Cellranger v. 7.1.0 (which uses STAR for alignment with default settings). Using the samtools flagstat command to look at a BAM file output: 284890368 + 0 in total (QC-passed reads + QC-failed reads) 0 + 0 secondary 0 + 0…

Continue Reading 10x Cellranger SN RNA-seq BAM file output looks off

Calling zero mapping quality variant

Calling zero mapping quality variant 0 Hello Is it possible to call variant with read that has zero mapping quality at the region? I found that there is INDEL in my BAM file when I visualize in IGV but the variant is not in gVCF, I have checked the average…

Continue Reading Calling zero mapping quality variant

Can’t convert paired end BAM to bed using bedtools

Can’t convert paired end BAM to bed using bedtools 0 Hello, I got .bam files from my pipeline and i merged them with samtools merge ALL.bam *.bam and I got this % of mapping 3825773 + 0 in total (QC-passed reads + QC-failed reads) 3825773 + 0 primary 0 +…

Continue Reading Can’t convert paired end BAM to bed using bedtools

fail to open index BAM file”

Dear all, I applied CNVkit to call copy number variations in a case-control study, with vascular malformation samples as case and healthy individuals as control using the following command. /ifs9/B2C_RD_P5/lijia3/CONDA/bin/python /ifs9/B2C_RD_P5/lijia3/CNVkit/cnvkit/cnvkit.py batch /zfsyt1/B2C_RD_P5/lijia3/VM/*.bam –normal /zfsyt1/B2C_RD_P5/lijia3/VM_controls/Control/Control-Kid/*.bam –targets BGI_Exome_V4_kit_200bp.bed –antitargets my_antitargets.bed –annotate refFlat.txt –fasta /ifs7/B2C_SGD/PROJECT/PP12_Project/analysis_pipeline/HPC_chip/db/aln_db/hg19/hg19_chM_male_mask.fa –access access.ucsc.hg19.bed –output-reference VM_normal_adult_reference.cnn –output-dir results_Kid/ –diagram…

Continue Reading fail to open index BAM file”

PHG Load haplotype and create consensus

Here, presented my PHG scripts, config, wgs_keyfile. 1. Create valid intervals docker run –name test_assemblies –rm -v /DATA/jysong/PHG/ver1.0_phg/:/phg/ -t maizegenetics/phg:1.0 /tassel-5-standalone/run_pipeline.pl -Xmx100G -debug -configParameters /phg/Masterconfig.txt -CreateValidIntervalsFilePlugin -intervalsFile /phg/inputDir/reference/glyma.Wm82.gnm4.ann1.T8TQ.gene_models_main.bed -referenceFasta /phg/inputDir/reference/glyma.Wm82.gnm4.4PTR.genome_main.fixed.fna.gz -mergeOverlaps true -generatedFile /phg/validBedFile.bed -endPlugin &> Log/1.Create_validinterval.txt & 2. Create initial DB docker run –name create_initial_db –rm -v /DATA/jysong/PHG/ver1.0_phg/:/phg/ -t…

Continue Reading PHG Load haplotype and create consensus

Interpreting the meaning of outward facing reads

Interpreting the meaning of outward facing reads 0 Sorry i have lots of data and im not sure how to interpret. I took mRNA created by IVT from our plasmid and performed paired end Illumina sequencing.I used samtools stats to get info about the various samples: read length ~150, insert…

Continue Reading Interpreting the meaning of outward facing reads

Extracting list of target genes from ChIP-seq bigwig file

Extracting list of target genes from ChIP-seq bigwig file 1 I am reading a publication where the authors performed ChIP-seq on a specific TF. Their data was deposited on NCBI in bigwig format. I would like to extract the final list of target genes. Is this possible using the bigwig…

Continue Reading Extracting list of target genes from ChIP-seq bigwig file

Compressing BAM, SAM, CRAM | Genozip

How good is Genozip at compressing BAM files? ​ See Benchmarks. ​ Compressing a BAM, SAM or CRAM file  ​ In the rest of this page we will give examples of BAM files. Genozip is also capable of compressing SAM files, and with some limitations, CRAM files as well. ​…

Continue Reading Compressing BAM, SAM, CRAM | Genozip

featureCounts paired end reads wrongly assigned on the single end mode

featureCounts paired end reads wrongly assigned on the single end mode 0 Hello! I have RNAseq data from samples containing 2 to 3 bacterial strains each. For the analysis I performed FastQC, trimmomatic to remove adapters and then Bowtie2 to align the reads to the reference genomes. I have one…

Continue Reading featureCounts paired end reads wrongly assigned on the single end mode

Samtools Convert Sam To Bam With Code Examples

Samtools Convert Sam To Bam With Code Examples In this session, we’ll try our hand at solving the Samtools Convert Sam To Bam puzzle by using the computer language. The code that follows serves to illustrate this point. # Basic syntax: samtools view -S -b sam_file.sam > bam_file.bam # Where:…

Continue Reading Samtools Convert Sam To Bam With Code Examples

different result using minimap2 and pbmm2

Hi all! I am analysing CSS Pacbio data and each sample came from different run, in particular I have three files for each sample. I tested both pbmm2 and minimap2 to align my long reads, after getting the consensus sequences. This is the command I used to run mnimap2: minimap2…

Continue Reading different result using minimap2 and pbmm2

pjotrp/sambamba – sambamba – Genenetwork

10 years ago ​ 10 years ago ​ ​ ​ ​ ​ ​ ​ ​ ​ ​ ​ ​ ​ ​ ​ ​ ​ 10 years ago ​ ​ ​ ​ ​ ​ 10 years ago 10 years ago 10 years ago ​ ​ 10 years ago ​ 10 years…

Continue Reading pjotrp/sambamba – sambamba – Genenetwork

I wonder if someone please explain what secondary, supplementary, duplicates and paired in sequencing mean in samtools flagstat

I wonder if someone please explain what secondary, supplementary, duplicates and paired in sequencing mean in samtools flagstat 0 ”194492 + 0 in total (QC-passed reads + QC-failed reads) 80 + 0 secondary 0 + 0 supplementary 0 + 0 duplicates 193804 + 0 mapped (99.65% : N/A) 194412 +…

Continue Reading I wonder if someone please explain what secondary, supplementary, duplicates and paired in sequencing mean in samtools flagstat

Ubuntu Manpage: sambamba-view – tool for extracting information from SAM/BAM files

Provided by: sambamba_0.8.2+dfsg-2_amd64 NAME sambamba-view – tool for extracting information from SAM/BAM files SYNOPSIS sambamba view OPTIONS <input.bam | input.sam> [region1 […]] DESCRIPTION sambamba view allows to efficiently filter SAM/BAM files for alignments satisfying various conditions, as well as access its SAM header and information about reference sequences. In order…

Continue Reading Ubuntu Manpage: sambamba-view – tool for extracting information from SAM/BAM files

sam – Use Htslib to create auxilary tags in bam file C++

I am creating a threaded c++ file where i generate in silico bam files, using header, DNA sequence and read information. First i use bam_init1() to create the bam1_t structure just named “b”. Then i use bam_set1 to create the actual sequence entry in the bam file bam_set1(b,read_id_length,READ_ID,flag,chr_idx,min_beg,mapq,n_cigar,cigar,-1,-1,0,strlen(DNAsequence),DNAsequence,quality_string,l_aux) And finally…

Continue Reading sam – Use Htslib to create auxilary tags in bam file C++

Samtools flagstat confusing result of a merged bam file

Hi, I am a bioinformatics student and I am struggling with an issue, I had paired-end fastq files for one sample with some low-quality bases at the end and adapter contamination, so I went and I trimmed my reads with trimmomatic, it gave me 4 files that I used for…

Continue Reading Samtools flagstat confusing result of a merged bam file

[SOLVED] changing the order of input changes samtools merge ouput

I realized that this is a stupid mistake I have made. Since samtools do not overwrite the files by default, the output that I get from samtools merge output.bam f2.bam f1.bam wan’t what I thought it was below is my original post ++++++++++++++++++++++++++ I’m using samtool/1.9.0 and I’m trying to…

Continue Reading [SOLVED] changing the order of input changes samtools merge ouput

Overestimation of number of reads from nanopore data (flagstat)

Same issue as mentioned on the minimap2 tool: github.com/lh3/minimap2/issues/236#issue-361097444 For example nanopore reads aligned to the host transcriptome the flagstat output is: 5953480 + 0 in total (QC-passed reads + QC-failed reads) 2961480 + 0 secondary 22696 + 0 supplementary 0 + 0 duplicates 4195469 + 0 mapped (70.47% :…

Continue Reading Overestimation of number of reads from nanopore data (flagstat)

Samtools flagstat

Samtools flagstat 1 I aligned my ONT sequencing run with minimap2, subsequently I filtered the file using samtools view -b -F 256 aln_transcriptome_sorted_6.bam -o filtered_aln_transcriptome_6.bam to end up with primary alignments only. When I run samtools flagstat on the filtered file I get the following output: 3502608 + 0 in…

Continue Reading Samtools flagstat

Primer design for variants in duplicated genes?

The issues you face relate to the fact that the majority of the genome exhibits sequence similarity, i.e., similarity with other regions in the genome. Much of this is indeed related to gene duplication events, with the duplicated genes acquiring new functionality over time due to mutations. As a rough…

Continue Reading Primer design for variants in duplicated genes?

Single-cell DNA and RNA sequencing reveals the dynamics of intra-tumor heterogeneity in a colorectal cancer model | BMC Biology

Organoid culture of small intestinal cells and lentiviral transduction C57BL/6J mice and BALB/cAnu/nu immune-deficient nude mice were purchased from CLEA Japan (Tokyo, Japan). The small intestine was harvested from wild-type male C57BL/6J mice at 3–5 weeks of age (Additional file 1: Figure S9A). Crypts were purified and dissociated into single cells,…

Continue Reading Single-cell DNA and RNA sequencing reveals the dynamics of intra-tumor heterogeneity in a colorectal cancer model | BMC Biology

MAPQ (Mapping quality) of 0 for most reads from BWA-MEM2 (with no secondary alignment or other apparent reason)

Hello, I got a very weird output from BWA-mem2 – most of the reads have mapping quality of 0, even though there is no secondary alignment or anything else suspicious. I got sequencing data that was aligned with Novoalign to hg18, the data was bam files. I needed to realign…

Continue Reading MAPQ (Mapping quality) of 0 for most reads from BWA-MEM2 (with no secondary alignment or other apparent reason)

Output of samtools view, what does the third column actually represent?

The samtools view outputs information from SAM and BAM files in SAM format. You can find a description of the SAM format here: samtools.github.io/hts-specs/SAMv1.pdf Section 1.4 deals with the meaning of each of the manditory coloumns. It includes the following table: Col Field Type Regexp/Range Brief description |—|——|——-|—————————-|—————————————-| 1 QNAME…

Continue Reading Output of samtools view, what does the third column actually represent?

qualimap2 mean mapping quality

qualimap2 mean mapping quality 0 I’ve done a contrast experiment to see the difference between the bam with BQSR and the bam without BQSR. I use qualimap to evaluate both bams. This is the confusing part. Using hap.py and the giab na12878 truth vcf, shows the bam with BQSR is…

Continue Reading qualimap2 mean mapping quality

How To Filter Mapped Reads With Samtools

Hi, You get a bam (machine readable sam) file after mapping, and it contains information about mapped and unmapped reads. To get the unmapped reads from a bam file use: samtools view -f 4 file.bam > unmapped.sam the output will be in sam to get the output in bam, use:…

Continue Reading How To Filter Mapped Reads With Samtools

Calling variants on reads with MAPQ=0 on HaplotypeCaller or bcftools mpileup

Calling variants on reads with MAPQ=0 on HaplotypeCaller or bcftools mpileup 2 I am working with about 500 samples of human exome data. used hg19 to align my reads and ran a standard best-practices GATK workflow. Later only to realise that a small 1Mb loci has not mapped properly due…

Continue Reading Calling variants on reads with MAPQ=0 on HaplotypeCaller or bcftools mpileup

Mapping quality and XS score

Mapping quality and XS score 0 Hi, I am currently looking through bam files using igv to manually check if the mutations not called by Mutect2 are really an error or not. Until now, to filter reads with low quality, I have used only MAPQ > 30 and didn’t consider…

Continue Reading Mapping quality and XS score

How to properly combine two bam files of a paired-end data

How to properly combine two bam files of a paired-end data 3 Hi all! I am mapping a paired-end read separately using bowtie2. After that, I want to combine the two bam file into one for downstream analysis. How to properly do this combination? I tried: samtools sort -n R1.bam…

Continue Reading How to properly combine two bam files of a paired-end data