Tag: BAM

UMItools dedup deduplication taking too much time + RAM

I have some RNAseq data from miRNAs that I have processed with Bowtie2 (aligning to miRBase). Now, when doing the deduplication with umi_tools dedup I find that some of the files take a lot of time+RAM to finish (some files take around 3-4 minutes and 4-5GB of RAM and some…


ZP77 – YFull YTree Info

R-ZP77 – YFull YTree Info SNPs currently defining R-ZP77 ZP77 / FGC6562     Sample ID Country / Language Info Ref File Testing company Statistics Status YF008362 —— R-ZP77* —— Hg19 .BAM FTDNA (Y500) 41X, 13.8 Mbp, 165 bp YF067652 Unknown R-BY40744 —— Hg38 .BAM FTDNA (Y700) 36X, 18.7 Mbp, 151…


Petabase-scale sequence alignment catalyses viral discovery

Serratus alignment architecture Serratus (v0.3.0) (github.com/ababaian/serratus) is an open-source cloud-infrastructure designed for ultra-high-throughput sequence alignment against a query sequence or pangenome (Extended Data Fig. 1). Serratus compute costs are dependent on search parameters (expanded discussion available: github.com/ababaian/serratus/wiki/pangenome_design). The nucleotide vertebrate viral pangenome search (bowtie2, database size: 79.8 MB) reached processing rates…


Efficiently merge two BAM files while retaining reads from only one file in overlapping regions

I have a WGS BAM file that is fairly large (>150GB) and a smaller BAM file (<5GB) with reads in a small 10Mbp region. I want to (efficiently) merge the two BAM files while…

variant – Error running gatk HaplotypeCaller with allele specific annotations

I’ve got HaplotypeCaller working nicely in standard mode, like so: # Run haplotypecaller gatk --java-options "-Xmx4g" HaplotypeCaller --intervals "$INTERVALS" -R "$REF" -I "$OUT"/results/alignment/${SN}_sorted_marked_recalibrated.bam -O "$OUT"/results/variants/${SN}_g.vcf.gz -ERC GVCF But when I try in allele-specific mode, I get the following error. All I’ve done is add the -G annotations at the end,…

Read bam/cram file with IGV from aws s3

Hi all, We store our alignment files on aws s3. I would like to be able to open them with IGV without needing to download them completely, but I can’t find an optimal solution. If I get a pre-signed url it works but it’s not convenient. I try to follow…


Samtools flagstat confusing result of a merged bam file

Hi, I am a bioinformatics student and I am struggling with an issue, I had paired-end fastq files for one sample with some low-quality bases at the end and adapter contamination, so I went and I trimmed my reads with trimmomatic, it gave me 4 files that I used for…


Ubuntu Manpage: samtools reheader – replaces the header in the input file

Provided by: samtools_1.13-2_amd64 NAME samtools reheader – replaces the header in the input file SYNOPSIS samtools reheader [-iP] [-c CMD | in.header.sam ] in.bam DESCRIPTION Replace the header in in.bam with the header in in.header.sam. This command is much faster than replacing the header with a BAM→SAM→BAM conversion. By default…


Unable to convert from sam to bam file.

samtools view -S -b BD143_TGACCA_L005.sam -o BD143_TGACCA_L005.bam When I run this command, the following error appears: [main_samview] fail to read the header from "BD143_TGACCA_L005.sam". Does anyone know how to fix this error? Thanks. …

samtools sort

I am converting SAM files to BAM. To sort them, I use this command: % cd /Volumes/GENOMA/BWA % samtools sort -n -O V350019555_L03_B5GHUMqcnrRAABA-551.sam | samtools fixmate -m -O bam V350019555_L03_B5GHUMqcnrRAABA-551.bam but it gives me the following error: As elsewhere in samtools, use '-' as the filename…

[SOLVED] changing the order of input changes samtools merge output

I realized that this was a stupid mistake on my part. Since samtools does not overwrite files by default, the output that I got from samtools merge output.bam f2.bam f1.bam wasn’t what I thought it was. Below is my original post. ++++++++++++++++++++++++++ I’m using samtool/1.9.0 and I’m trying to…

Estimating individual mtDNA haplotypes in mixed DNA samples by combining MinION and MiSeq

doi: 10.1007/s00414-021-02763-0. Online ahead of print. Affiliations: 1 Department of Forensic Medicine, Juntendo University School of Medicine, 2-1-1, Hongo, Bunkyo-Ku, Tokyo, 113-8421, Japan. hnakani@juntendo.ac.jp. 2 Department of Forensic Medicine, Saitama Medical University, 38 Morohongo, Moroyama, Saitama, 350-0495, Japan. 3 Department of Forensic Medicine, Juntendo University School of Medicine,…

Issue running MACS3

I am having issues running MACS3. I installed MACS3 using: wget github.com/macs3-project/MACS/archive/refs/tags/v3.0.0a6.tar.gz tar -xf v3.0.0a6.tar.gz chmod a+rwx MACS-3.0.0a6/bin/macs3 It appears to be installed correctly because the following code generates the predictd help window: MACS-3.0.0a6/bin/macs3 predictd --help However, when I try running the actual code I get the following error: MACS-3.0.0a6/bin/macs3…

mergue bam itv

I am trying to create a combined BAM file containing all the reads, but it gives me an error when loading. In a Mac text editor, I enter the paths of the three files and save it with the extension bam.list. I introduce HARD…

Bwa on multiple processor

Hi Guys, When I am trying to run bwa mem on multiple processor, I am getting error as : > mpirun -np 16 bwa mem hg19-agilent.fasta R1.fastq R2.fastq | samtools sort -o aln.bam [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::bwa_idx_load_from_disk] read…


processing in strelka2 with multiples bam file in directory

If I manually tell strelka2 to use the three bam files below, I get the desired result of 3 individual genome files in results/variants. xxx_00.bam yyy_01.bam zzz_02.bam ${path_to_strelka}/bin/configureStrelkaGermlineWorkflow.py --bam xxx_00.bam --bam yyy_01.bam --bam zzz_02 --referenceFasta <fasta> --callRegions <.bed.gz> --runDir…

Aligning multiple single and paired-end reads from multiple files (lanes)

Hello, I am new to bioinformatics and looking for some help. I have 27 files from an Illumina output. There are 4 paired end and 23 single read files. I am trying to align them using Rsubread in…

Samtools flagstat

I aligned my ONT sequencing run with minimap2; subsequently, I filtered the file using samtools view -b -F 256 aln_transcriptome_sorted_6.bam -o filtered_aln_transcriptome_6.bam to end up with primary alignments only. When I run samtools flagstat on the filtered file I get the following output: 3502608 + 0 in…
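One point that may help when reading such flagstat output: the -F option of samtools view drops a record when its FLAG shares at least one bit with the mask, so a mask of 256 (0x100, secondary) leaves supplementary records (0x800) in the file. The sketch below only mimics that bit test in Python for illustration; the function name is hypothetical and this is not samtools code.

```python
SECONDARY = 0x100      # 256: secondary alignment
SUPPLEMENTARY = 0x800  # 2048: supplementary alignment


def passes_exclude_mask(flag: int, exclude_mask: int) -> bool:
    """Return True when no bit of exclude_mask is set in flag (the -F rule)."""
    return (flag & exclude_mask) == 0


if __name__ == "__main__":
    # Compare a mask of 256 with a mask of 256 + 2048 on a few FLAG values.
    for flag in (0, 16, 256, 2048, 2064):
        kept_256 = passes_exclude_mask(flag, SECONDARY)
        kept_2304 = passes_exclude_mask(flag, SECONDARY | SUPPLEMENTARY)
        print(flag, kept_256, kept_2304)
```

Under this rule a mask of 2304 (256 + 2048) would exclude both secondary and supplementary records.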


Alignment report

Hi Guys, I aligned R1 and R2 fastq files to a reference genome using bwa mem and got a bam file. Now, I want to check whether the alignment was done correctly and see the alignment percentage, coverage, etc. I ran the following command: bwa mem hg19.fasta R1.fastq R2.fastq | samtools…

Running samtools view on bam affects the number of variants called by both haplotypecaller and deepvariant – C samtools

Thanks for getting back to me Valeriu. As you suggested, I used the latest commit from the develop branch in my pipeline, and the results look good. I was able to replicate the numbers from samtools v1.10.2 and v1.11 for both variant callers. FYI $ docker run scilifelabram/htslib:dev_proper /opt/samtools/samtools version…


bedtools intersect error: Invalid record in file

Hello to all I am trying to run bedtools intersect with vcf file and a bed file (my goal is to add the depth data to my VCF) I get an error running this command: bedtools intersect -a depth.bed -b fish.vcf -wa -wb > $out The error: “Error: Invalid record…


Rockhopper’s alignment issue

Hi everyone, I’m trying to identify the operons with the Rockhopper tool but at the end of the alignment something strange happens: Aligning sequencing reads from file: SRR6757591_1.sorted.bam Total reads: 10209877 Successfully aligned reads: 9358027 92% (>NC_000913.3 Escherichia coli str. K-12 substr. MG1655, complete genome) Aligning…

VEP issue: ERROR: Cache assembly version (GRCh37) and database or selected assembly version (GRCh38) do not match

Describe the issue: VEP gives errors even though my query and reference have the same assembly version. Command: ./vep -i examples/homo_sapiens_GRCh37.vcf --cache --refseq Cache reference details while running install.pl: ? 458 NB: Remember to use --refseq when running the VEP with this cache! downloading ftp.ensembl.org/pub/release-104/variation/indexed_vep_cache/homo_sapiens_refseq_vep_104_GRCh37.tar.gz unpacking homo_sapiens_refseq_vep_104_GRCh37.tar.gz converting cache, this may…

Access read group tag for BAM reads – rust-htslib

I’m sure this is probably straightforward, but I can’t figure out how to access the read group information on a per-read basis within a BAM file. I can get it from the header, but I can’t determine how to use aux or any other record method to get that info….


“Paired-end reads were detected in single-end read library”

Hello, I am using “featureCounts” in the Rsubread package to analyze bulk RNA-seq of Drosophila. Since there are no inbuilt annotations for Drosophila, I am trying to use a gtf file in the homepage…

samtools sorts allocate memory for bam_mem issues

Hello everyone, I am trying to convert sam to bam: samtools sort -@ 8 -o UHR_Rep1.bam UHR_Rep1.sam and I got this error: samtools sort: couldn’t allocate memory for bam_mem I checked my disk memory and I see I have enough space in my…

SNP associated contigs

I am using usegalaxy.org for SNP analysis. After mapping with bowtie2 I got a BAM file including QNAME, FLAG, RNAME and POS, and from variant calling I got Chrom, Pos, ID, Ref and Alt, showing the variant position on the reference genome. I want to get the sequences associated with each SNP….

sequence alignment – MarkDuplicatesSpark failing with cryptic error message. MarkDuplicates succeeds

I have been trying to follow the GATK Best Practice Workflow for ‘Data pre-processing for variant discovery’ (gatk.broadinstitute.org/hc/en-us/articles/360035535912). This has all been run on Windows Subsystem for Linux 2 on the Bash shell. I started off with FASTQ files from IGSR (www.internationalgenome.org/data-portal) and performed alignment with Bowtie2 (instead of…

Which is @RG read group in the head of bam files?

I am not sure which part of the head of bam files is the @RG read group. I am curious why all of my samples have the same A01494:44:H53Y7DMXY:1 part. Are they read groups? samtools view -S Sample_7R-MDV_IGO_09530_H_1_dedup.bam | head…
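For orientation, a string such as A01494:44:H53Y7DMXY:1 has the shape of the leading fields of a standard Illumina read name (instrument, run number, flowcell ID, lane), which is a different thing from the @RG header line or the per-read RG tag. Below is a minimal sketch that splits such a name, assuming the usual seven colon-separated fields; the function and class names are hypothetical, and the tile and x/y values in the example are made up for illustration.

```python
from typing import NamedTuple


class IlluminaReadName(NamedTuple):
    instrument: str
    run_number: str
    flowcell_id: str
    lane: str
    tile: str
    x: str
    y: str


def parse_illumina_read_name(qname: str) -> IlluminaReadName:
    """Split a read name of the assumed form inst:run:flowcell:lane:tile:x:y."""
    fields = qname.lstrip("@").split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 colon-separated fields, got {len(fields)}")
    return IlluminaReadName(*fields)


if __name__ == "__main__":
    # Only the first four fields come from the excerpt; the rest are invented.
    name = parse_illumina_read_name("A01494:44:H53Y7DMXY:1:1101:1000:2000")
    print(name.instrument, name.run_number, name.flowcell_id, name.lane)
```

Reads from the same lane of the same run share these leading fields, which would be consistent with many samples showing the same prefix.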


PathSeqFilterSpark

I have been trying to filter out low-quality bases as part of a variant annotation task, and I have completed all the previous steps required. However, when I try to filter out low-quality bases after BQSR (GATK), PathSeqFilterSpark did not yield an output file. There was no…

bedtools genomecov problem with merged bam

Hi, I was using purge_haplotigs, and in that workflow the first step is to use bedtools genomecov, so I moved here. I have three paired-end datasets: Illumina WGS reads, Hi-C reads, and Chicago sequencing reads. I aligned the paired-end reads of Illumina WGS to the genome,…

Average Read length

Hello Everyone! Is there a standard tool commonly used to calculate the average read length of fastq files? If yes, please mention it here, because I want to know the average read length of my fastq files so that I can decide the cutoff for…
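As a rough illustration of the calculation being asked about (not a replacement for an established QC tool), the sketch below averages the sequence-line lengths of a FASTQ stream. It assumes strict four-line records with no wrapped sequences, and the function name is hypothetical.

```python
from typing import Iterable


def average_read_length(fastq_lines: Iterable[str]) -> float:
    """Mean length of the sequence lines, assuming strict 4-line FASTQ records."""
    total_bases = 0
    n_reads = 0
    for index, line in enumerate(fastq_lines):
        if index % 4 == 1:  # second line of each record holds the sequence
            total_bases += len(line.strip())
            n_reads += 1
    if n_reads == 0:
        raise ValueError("no reads found")
    return total_bases / n_reads


if __name__ == "__main__":
    example = ["@r1", "ACGT", "+", "IIII", "@r2", "ACGTAC", "+", "IIIIII"]
    print(average_read_length(example))
```

For a real file, an open text handle (or gzip.open(path, "rt") for compressed input) can be passed in place of the list.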


Benchmarking the NVIDIA Clara Parabricks germline pipeline on AWS

This blog post was contributed by Ankit Sethia, PhD, and Timothy Harkins, PhD, at NVIDIA Parabricks, and Olivia Choudhury, PhD,  Sujaya Srinivasan, and Aniket Deshpande at AWS. This blog provides an overview of NVIDIA’s Clara Parabricks along with a guide on how to use Parabricks within the AWS Marketplace. It…


samtools mpileup error – 1 samples in 1 input files

Hi All, I am relatively new to bioinformatics and have encountered an issue when trying to generate an mpileup file with samtools. I entered the following command: samtools mpileup -f /home/path_to_reference/nCoV_Jan31.fa.fasta sorted_sample1.sam > sample.mpileup The message returned is…

Different FastQC results after name-sorting BAM file, sequence duplication increases

Okay, so what I did might have been stupid, but I was determined to examine a lot of things on my own and experiment a bit with tools. At one point I decided to do this: I had BAM file…

rust-bio-tools 0.35.0 – Docs.rs

rust-bio-tools-0.35.0 is not a library. A set of ultra fast and robust command line utilities for bioinformatics tasks based on Rust-Bio. Rust-Bio-Tools provides a command rbt, which currently supports the following operations: a linear time implementation for fuzzy matching of two vcf/bcf files (rbt vcf-match) a vcf/bcf to txt converter,…


Attempting to generate a bam.bai file but the output is not readable

Hi, I am new to exome sequencing, and have tried to follow tutorials on the subject. I am stuck at the samtools index stage because the output files are in a non-human readable format and I believe…

ChIP-Seq density plot between two groups of genes

Hello, I have ChIP-seq data (.bam file). I normalized my bam file with the BPM method and created a density plot using deepTools for two groups of genes (genes of interest). The groups of genes are different…

snakemake truncating shell codes

I’m trying to change the chromosome number notation from [0-9XY] to Chr[0-9XY] using the samtools reheader in the shell command of the snakemake. rule rename: input: os.path.join(config["input"], "{sample}.bam"), output: os.path.join(config["output"], "new_sample/{sample}_chr.bam") log: os.path.join(config["log"], "samtools/{sample}") shell: "samtools view -H {input} | sed -e 's/SN:([0-9XY]*)/SN:chr1/' -e 's/SN:MT/SN:chrM/'…
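The renaming that the two sed expressions appear to aim for can also be written without shell quoting. The sketch below is one possible version in Python: it rewrites only the SN field of @SQ header lines, mapping numeric, X and Y names to a chr prefix and MT to chrM. The function name is hypothetical, the chr/chrM mapping is taken from the excerpt, and nothing here is part of samtools or Snakemake.

```python
import re

# Contig names that should receive a "chr" prefix (assumption from the excerpt).
_PLAIN_NAME = re.compile(r"[0-9]+|X|Y")


def rename_sq_line(line: str) -> str:
    """Rewrite the SN: field of one @SQ header line; other lines pass through."""
    if not line.startswith("@SQ"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    for i, field in enumerate(fields):
        if field.startswith("SN:"):
            name = field[3:]
            if name == "MT":
                fields[i] = "SN:chrM"
            elif _PLAIN_NAME.fullmatch(name):
                fields[i] = "SN:chr" + name
    return "\t".join(fields) + newline


if __name__ == "__main__":
    for header_line in ("@HD\tVN:1.6", "@SQ\tSN:1\tLN:248956422", "@SQ\tSN:MT\tLN:16569"):
        print(rename_sq_line(header_line))
```

The rewritten header text could then be supplied to samtools reheader as the replacement header file.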


Padding out a GVCF file with 1000G exomes to get gatk VariantRecalibrator working with a small sample

I’ve got sequencing data for a small 500 bp amplicon from a few samples. GATK best principles suggest running VariantRecalibrator on the GVCF files I generate. I’m trying to get this working, but I get an error about “Found annotations with zero variances”. Reading the gatk manual and other posts…


computeMatrix in deeptool is Running with no result

Hi All, I wonder if someone can help me by explaining what to input for the -R <bed file> argument of the code below? computeMatrix scale-regions -S <bigwig file(s)> -R <bed file> -b 1000 What I did, for example: I download…

NoClassDefFoundError: htsjdk/samtools/util/IntervalTree

When I run the circm6A (github.com/canceromics/circm6a) example code: cd ../.. java -Xmx16g -jar circm6a.jar -ip test_data/HeLa_eluate_rep_1.chr22.bam -input test_data/HeLa_input_rep_1.chr22.bam -r test_data/gencode_chr22.gtf -g test_data/hg38_chr22.fa -o test_data/example_Hela The following error occurred: Start at 2021-12-12 16:33:26 Exception in thread "main" java.lang.NoClassDefFoundError: htsjdk/samtools/util/IntervalTree at main.Method.loadGenes(Method.java:200) at main.Method.run(Method.java:66) at main.Main.main(Main.java:9) Caused by: java.lang.ClassNotFoundException: htsjdk.samtools.util.IntervalTree…

[moiexpositoalonsolab/grenepipe] freebayes causes early error about number of threads

Hi Lucas, got a weird one for you. If I change the caller from haplotypecaller to freebayes, I get the error below. It’s doubly strange because it seems to occur well before freebayes would be used in the pipeline. [Sat Dec 11 11:13:02 2021] rule samtools_stats: input: dedup/111D03-1.bam output: qc/samtools-stats/111D03-1.txt…

Deeptools: computeMatrix can’t read file

Hi, I generated a bigwig file using bamCoverage with the following code. bamCoverage -b A.bam -o A.bw --binSize 10 -p max --normalizeUsing CPM Then, I tried to use computeMatrix: computeMatrix -S A.bw -R B.bw -o C but I got the following error: usage: computeMatrix [-h]…

A matrix sample for Profile plots and heatmaps of Computematrix, deepTools

Hi everyone, I have a count matrix from featureCounts and, of course, a couple of peak (.bed) files. I want to visualize the peaks all together to show the coverage and compare them overall. I was going to use…

Using STAR SJ.out.tab file to identify novel ncRNAs

Hi All, I am attempting to identify novel ncRNAs from a circadian RNAseq dataset. Specifically I have a ribo-depleted RNAseq timecourse with 31 samples (sample every 2 hours for 60hrs). I have run STAR (code below). I am trying to follow…

Genome Bioinformatics Analyst – Pittsburgh

**Description** UPMC Presbyterian is hiring a Genome Bioinformatics Analyst to join the Molecular and Genomic Pathology Laboratory (MGP) team! This role will work a daylight schedule Monday through Friday. No weekends or holidays are required! The Molecular and Genomic Pathology Laboratory (MGP) is a dynamic state-of-the-art clinical laboratory that prides…


htseq-count Error ‘_StepVector_Iterator_obj’ object has no attribute ‘next’

I am trying to run htseq-count (v. 0.13.5) on a sorted and indexed bam file. The command I entered looks like this: htseq-count -f bam -r pos -s yes -t CDS -i gene_id -m union filename_sorted.bam filename.gtf I get the following…

Senior Bioinformatics Scientist II/ Staff Bioinformatics Scientist

Inscripta was founded in 2015 and recently launched the world’s first benchtop Digital Genome Engineering platform. The company is growing aggressively, investing in its leadership, team, and technology with a recent $150 million financing round led by Fidelity and T. Rowe Price. The company’s advanced CRISPR-based platform, consisting of an instrument, reagents,…

Problem with using flagstat after bowtie2 alignment

I’m running bowtie2 to align multiple samples to one reference genome, and then run samtools flagstats to output the results. All but two samples have aligned and I’ve managed to run flagstat on them. For those two samples, when I run flagstat, I first get: [W::bam_hdr_read] EOF marker is absent….


How to call LOH with FreeC

Good morning, I am trying to infer loss of heterozygosity (LOH) from WGS data using FreeC. For this purpose, I am using these parameters in the “[BAF]” section of the configuration file: [BAF] makePileup = My_somaticVCF.vcf.gz fastaFile = hg19.fa SNPfile = hg19_snp142.SingleDiNucl.1based.txt.gz When…

Extracting Number of SNPs via parsing MD tags

Hello all, I’m having a bit of difficulty wrapping my head around a task involving extracting the total number of SNPs from an alignment via creating a string parser/grep command which would be able to extract only the SNPs and ignoring indels. I am currently using a python script utilising…
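To make the task concrete, here is a minimal sketch of one way to count single-base mismatches in an MD string while skipping deletions. It relies on the MD layout described in the SAM specification (match-run lengths, single mismatched reference bases, and deleted reference bases prefixed with ^); insertions do not appear in MD at all. The function name is hypothetical, and a mismatch against the reference is not necessarily a true SNP.

```python
import re

# One token per alternative: a match-run length, a deletion (^BASES),
# or a single mismatched reference base.
_MD_TOKEN = re.compile(r"(\d+)|(\^[A-Za-z]+)|([A-Za-z])")


def count_md_mismatches(md: str) -> int:
    """Count single-base mismatches in an MD tag value, ignoring deletions."""
    count = 0
    for _match_run, _deletion, mismatch in _MD_TOKEN.findall(md):
        if mismatch:
            count += 1
    return count


if __name__ == "__main__":
    # 10 matches, mismatch, 5 matches, 2-base deletion, 6 matches, mismatch, 3 matches
    print(count_md_mismatches("10A5^AC6T3"))
```

Summing this count over the primary alignments in a file would give a total mismatch count for the alignment.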


Removing contamination with SNP tools

Hello everyone, Currently, I’m working on a ChIPseq dataset where I will analyze chromatin marks on transposons and genes in a fungus. Unfortunately, I got some contamination in my data from a closely related species. Because they are so similar, removing contamination based on…

TRON-Bioinformatics/tronflow-bam-preprocessing: Release v1.7.1 | Zenodo

DOI: 10.5281/zenodo.5768626 (https://doi.org/10.5281/zenodo.5768626)

QualiMap Multi-Sample BamQC does not load in center panel – usegalaxy.eu support

Dear @Mario_Garcia, The tool should work; I tested it right now with some test data. Please make sure: (a) your data was correctly analyzed by QualiMap BAM QC (i.e., the files should not be empty), (b) your files come from the same organism and data library, (c) and your data has no weird…

multiBamSummary for unpaired end read?

Hello everyone, I am using multiBamSummary to compute read coverages for many bam files. multiBamSummary bins --bamfiles *.bam -out readCounts.npz --outRawCounts readCounts.tab However, I got the errors below. May I get help please? [E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes Traceback (most recent call…

Request to honour POSIX ACLs

Hi folks. Would it be possible to have the access checks on bam files enhanced to check whether or not POSIX ACLs will allow users to access files? HTS.pm line 1387 uses filetest access and, according to the perldoc for filetest …you may encounter surprises if your program runs on…


Best software for hQTL analysis with the aim of performing MR analysis

I would like to perform QTL analysis on a list of BAM files that I have containing the locations of histone mark reads. Ultimately my aim is to perform Mendelian randomisation, with these hQTLs as the exposure, against a different phenotype as the outcome. The outcome effects are expressed as…

Create junctions from Bed file for IGV visualization

Any advice for creating a junctions file from a bed-like file? My bed file looks like this: chr start end chr start end I have tried to copy the format used in TopHat (junctions file). But I can’t see the junctions in…

How to call variant by --max-depth for RNAseq

Hi everyone! I have a query regarding variant calling from a high coverage site on the basis of the maximum likelihood variant. I have an RNA-seq BAM file of mapped reads. I called variants using the command below. "bcftools mpileup --max-depth 10000 -Oz -f ref.fa sample.bam | bcftools call -mv -Oz -o…

VCF samtools

Hello, I am having trouble when doing variant calling with samtools. I am getting only the header and no variants. If I instead use Freebayes, I do get a lot of variants, and with GATK, I get just a few. What can the problem be? Do…

How to handle VCFs from the same sample but using different aligners and variant callers?

Hi, I’m using whole-exome sequencing (WES) for somatic variant calling. During the process, I tried to follow the approach described here: pubmed.ncbi.nlm.nih.gov/28420412/ Basically my workflow is as follows: FASTQ preprocessing: Using 2 aligners (BWA-MEM, Bowtie2) BAM calibration Variant calling: Using 3 software (Mutect2, Strelka2, Lancet) Variant filtering: I keep just…


Somatic Variant Calling

Hi, I need to call somatic variants from a BAM file of a cancer panel. Can anyone please suggest a suitable tool for calling the variants and generating a VCF file? Thank you. “Suitable” is very context-dependent, are you working…

Low read consensus with GroupReadsByUmi

Hi everyone, I’m trying to troubleshoot a low read consensus rate in duplex sequencing data by using GroupReadsByUmi and CallDuplexConsensusReads in the fgbio package. For CallDuplexConsensusReads, I use: java -jar fgbio-1.4.0.jar CallDuplexConsensusReads --input=grouped.bam --output=consensus.unmapped.bam --error-rate-pre-umi=15 --error-rate-post-umi=10 --min-input-base-quality=10 But I get: [2021/12/06 17:33:10 | CallDuplexConsensusReads | Info] Raw Reads Filtered Due…

CAR T Cells in R/R DLBCL

Marin F. Xavier, MD: We talked about loncastuximab and LOTIS data, LOTIS-1. There’s 1 more thing that’s out there to be aware of. LOTIS-8 is being tested in newly diagnosed patients, but I haven’t reviewed that yet. Be on the lookout for that to be reported. Of course, when I…


Making consensus sequence for each haplotype

I’m dealing with paired end amplicon sequencing data. I’ve produced a GVCF file with haplotype calls using: gatk HaplotypeCaller -R $REF -I "$BAM" -O "$OUT"/results/variants/${SN}_HaplotypeCallerPGT.vcf -ERC GVCF The vcf file it produces contains the PGT flag, and variants are called in the format…

multiqc could not generate results

Hello all, I was running MultiQC in my current directory, where my bam/fastqc files are located, to generate some general statistics but could not get any result (No analysis results found. Cleaning up..). I installed the software and ran it as follows: pip install multiqc…

featureCounts difference assigned reads summary file and summed up reads in feature count matrix

Dear all, this might be a naive question but my google-fu fails me. I count reads from a bam, aligned by STAR against a custom hg19 genome, after running Picard MarkDuplicates, then counting reads assigned…

state and usage of compressed file standards better than BAM and FASTQ

Extra compressed formats for raw/aligned reads and variant tables have been around for some time but I think they saw slow adoption. Our current disk space usage is making us have another look at switching to file…

how to do basic statistics for bam files

Hi Biostar team, I have unpaired exome-seq data. I did the quality control and alignment. Now my files are in bam format and I would like to do some basic statistics like fragment size, coverage, mismatches, gaps, duplicates etc….

Extract human, mouse and pig aligned reads from .bam file alignment to concatenated genomes

I aligned an NGS read dataset to a concatenated reference consisting of the human, mouse and pig genomes. Now I would like to extract only the human (GRCh38) aligned reads into a separate .bam file…

Continue Reading Extract human, mouse and pig aligned reads from .bam file alignment to concatenated genomes

Help with picard

Can anyone help me? I have no idea why my Picard is not working. It used to work, but now when I run a pipeline, this Picard step fails. Thank you so much! (R-4) $ java -jar picard.jar CollectRawWgsMetrics -R /ref/hs37d5.fa -I CPM00002066-PL-D-20191015_20211104-dragen-somatic_tumor.bam -O…

Continue Reading Help with picard

Alignment fastq files

I have a question. I need to align unpaired FASTQ files and convert the result into a BAM file. To be sure, I need to ask first whether the commands below are enough to do this or whether I forgot something. bowtie2 -x input.index.hg19 -U input.fastq -S {output}…

Continue Reading Alignment fastq files

How to merge samples of the same cell type to do differential peak calling?

Hi! I have 6 ATAC-seq samples that have been aligned to hg19, with peaks called by MACS2. 3 of them are from A cells and the others are from B cells. Now I’m going…

Continue Reading How to merge samples of the same cell type to do differential peak calling?

Error “start too small” when running htseq-count on a sorted .bam file

Hello, This is my first time aligning scRNA-seq reads to a reference genome to analyze differential gene expression. I am using htseq-count to obtain count files for my different samples and I am receiving the following error:…

Continue Reading Error “start too small” when running htseq-count on a sorted .bam file

Alignments not labelled as proper pair on bwa mem

I used bwa mem to perform a paired alignment of two FASTQ files. The resulting BAM file would have been used to generate an mpileup and finally a consensus sequence. I noticed that for this alignment, many paired and seemingly…

Continue Reading Alignments not labelled as proper pair on bwa mem

Run multiple times samtools and sed for a big number of bam files in folder

How can we execute the following commands with bash for a large number of BAM files in a folder? samtools view -H in.bam > header.sam sed -i s/SN:/SN:chr/ header.sam sed -i s/SN:chrMT/SN:chrM/ header.sam samtools…

Continue Reading Run multiple times samtools and sed for a big number of bam files in folder

Add or reveal read groups on .sam file aligned by BWA

Hi, I’m trying to use GATK HaplotypeCaller but every time I run it, it says: A USER ERROR has occurred: Argument emit-ref-confidence has a bad value: Can only be used in single sample mode currently. Use the --sample-name argument to…

Continue Reading Add or reveal read groups on .sam file aligned by BWA

how to visually compare BAM file differences

I am a bioinformatics novice learning the workflow for calling somatic mutations. The BAM-related steps I have found are: sort, markdup, reorder, indel realignment, and BQSR. I want to know how the file differs after I execute each step…

Continue Reading how to visually compare BAM file differences

How To Open Bam Files Without Software?

BAM files can be opened remotely (ftp, http) or locally. To view a BAM file, its index file must be in the same directory. The index should be named by appending “.bai” to the BAM file name. How Do…

Continue Reading How To Open Bam Files Without Software?

Mean samples coverage and mapping quality in Qualimap

Hello, Can anyone explain what “mean samples coverage” means in the Qualimap multi-sample report? Also, what are acceptable values for “mean samples coverage” and “Mean samples mapping quality”? After analyzing some 400 BAM files (from ddRAD data mapped to a scaffold-level…

Continue Reading Mean samples coverage and mapping quality in Qualimap

Insert size histogram from Picard for Illumina paired end 150 bp: FR, TANDEM, and both

I’ve got some low-coverage skim-seq BAM files (1x) and was doing QC on them and got some strange results. I ran Picard CollectInsertSizeMetrics. The sequencing was Illumina paired-end and the orientation should be FR as usual. But I got insert size histograms showing FR, TANDEM, and…

Continue Reading Insert size histogram from Picard for Illumina paired end 150 bp: FR, TANDEM, and both

featurecounts in command line

I’m trying to convert my BAM files to count data with featureCounts on the command line. I used the code: featurecounts -T 8 -a /Users/ria/Desktop/bowtie_2/GCF_000001405.39_GRCh38.p13_genomic.gtf -g 'transcrip_id' -o readcounts/readcount1.txt bam files/-.bam (readcounts is the directory for dumping the output). The error…

Continue Reading featurecounts in command line

Problem with bowtie2 alignment – libtbb.so.2

In a nutshell, I have 44 folders of different samples/species, each containing paired reads for that sample/species. I’m doing Bowtie 2 alignment against the same reference genome, then outputting to BAM and sorting with samtools. Since alignment takes a while, I’ve written a script and passed it…

Continue Reading Problem with bowtie2 alignment – libtbb.so.2

Corset error reading input files

Input: corset -g F,F,F,F,G,G,G,G F1.bam F2.bam F3.bam F4.bam G1.bam G2.bam G3.bam G4.bam Output: Running Corset Version 1.09 Setting sample groups:F,F,F,F,G,G,G,G, 2 groups in total The number of experimental groups passed (via the -g option) does not match the number of input files. Please check how many values you have passed….

Continue Reading Corset error reading input files

best practice to design and reuse a process/workflow

Let’s say I want to genotype a set of BAMs using GATK. A basic DSL2 Nextflow workflow would look like: workflow { take: reference beds bams main: hc = haplotypecaller(reference,bams.combine(beds)) bed2vcf = combinegvcf(hc.groupTuple()) vcf = gathervcfs(bed2vcf.collect()) } process haplotypecaller { input: val(reference) tuple val(bam),val(bed) output: tuple bed,path("sample.g.vcf.gz") script: """ gatk…

Continue Reading best practice to design and reuse a process/workflow

Senior Scientist, Bioinformatics in Baltimore, MD

Summary of Major Responsibilities: The Senior Bioinformatics Scientist will coordinate the work to support product requirements and internal customer demands regarding in-the-cloud complex bioinformatics pipelines, with a focus on cloud-based analysis of next-generation sequencing (NGS) data for biomarker detection and product development. This position will conduct and supervise the implementation of algorithms…

Continue Reading Senior Scientist, Bioinformatics in Baltimore, MD

Error while aligning with STAR

Hi, when I try to run this command (note that I changed the working directory to the directory containing the FASTQ files): STAR --genomeDir /N/slate/ogaafer/mm10_index --runThreadN 8 --readFilesIn SRR8278856_1.fastq,SRR8278856_2.fastq SRR8278857_1.fastq,SRR8278857_2.fastq SRR8278859_1.fastq,SRR8278859_2.fastq --outSAMattrRGline ID:cont1 , ID:cont2 , ID:cont3 --outSAMtype BAM SortedByCoordinate --outSAMunmapped Within --outSAMattributes Standard I get this…

Continue Reading Error while aligning with STAR

How does “bedtools intersect -s” work with paired-end sequences?

Hi, I have paired-end reads from RNA-seq experiments and I want to use bedtools intersect with the “force strandedness” parameter (-s). Will bedtools take the paired-end reads into account and treat them as a single event, or will I get two…

Continue Reading How does “bedtools intersect -s” work with paired-end sequences?

Problems with consensus fasta

Hi everyone! I’m new to bioinformatics and also to informatics, so I’m struggling a bit trying to learn on my own. At the moment I’m having some difficulties trying to generate a FASTA file from a BAM file of a complete human genome. I’ve read on the internet that a…

Continue Reading Problems with consensus fasta

consensus fasta

Hi everyone! I’m new to bioinformatics and also to informatics, so I’m struggling a bit trying to learn on my own. At the moment I’m having some difficulties trying to generate a FASTA file from a BAM file of a complete human genome. I’ve read on the internet that a…

Continue Reading consensus fasta

Generate a count matrix for a consensus peak set and related peaks/reads

Hi everyone, I have a consensus peak file (.bed) that has chr, start, and end columns. It doesn’t have a header. Something like this: chr1 721000 726999 chr1 817800 821799 chr1 1027400 1030799 chr1 1033600 1037599 chr1 1047400 1050399 I want to generate a count matrix for my downstream analysis and differential…

Continue Reading Generate a count matrix for a consensus peak set and related peaks/reads

Genome Size Estimation with Jellyfish and Genome Scope is Unexpectedly Small

I used Jellyfish to count k-mers, and the resulting histograms were analyzed using GenomeScope (qb.cshl.edu/genomescope/) to estimate effective genome size. However, the resulting estimate of 61,351 bp is unexpectedly small for human data. The paired-end data was obtained from a ChIP-seq experiment on GEO (GSE72141). Additionally,…

Continue Reading Genome Size Estimation with Jellyfish and Genome Scope is Unexpectedly Small

Problem with peak calling using hiddenDomains: entire chr12 lacking peaks.

Hi, I have a problem processing ChIP-seq data. I have two datasets of H3K9me3 replicates aligned to the reference genome mm39. For peak calling, I used hiddenDomains. About my problem: replicate A shows peaks on all chromosomes after peak calling, while replicate C shows no peaks on chr12. The .bw files…

Continue Reading Problem with peak calling using hiddenDomains: entire chr12 lacking peaks.

bcl-convert and UMI’s

Hi, We’re looking into using bcl-convert as an alternative to Picard for “UMI-aware” demultiplexing. Picard can demultiplex straight to BAM with the UMI in the correct RX tag. bcl-convert appends the UMI sequence to the FASTQ header line when using the OverrideCycles option. I’m looking…

Continue Reading bcl-convert and UMI’s

Extracting matching reads by read ID

What tool would you recommend to compare two BAM files and extract matching reads by read ID? samtools view file1.bam | awk -F "\t" '{print $1}' | sort | uniq > names_in_file1 filterbyname.sh -Xmx4g in=file2.bam names=names_in_file1 out=file.fq.gz include=t file.fq.gz…

Continue Reading Extracting matching reads by read ID

Alignment of de novo assembled transcripts to reference transcriptome?

Hi all, I started with 40 raw (but trimmed) RNA-seq samples. I used these as inputs for SPAdes-rna and Trinity, to assemble them without a reference. I now have 80 assembled transcriptomes (?), and I tried to follow…

Continue Reading Alignment of de novo assembled transcripts to reference transcriptome?

STAR Genome indexing (Homo_sapiens_assembly38.fasta vs. GRCh38.primary_assembly.genome.fa)

I have a query regarding STAR alignment. I used the following command to generate the genome index (Homo_sapiens_assembly38.fasta): STAR --runMode genomeGenerate --genomeDir /home/bsh/BC_MCFcellLine_WTS/result/STAR_indexing/ --genomeFastaFiles /data1/database/ftp.broadinstitute.org/bundle/hg38_210610_download/Homo_sapiens_assembly38.fasta --sjdbGTFfile /home/bsh/BC_MCFcellLine_WTS/gencode.v27.annotation.gtf And I used the following command for mapping, and the BAM file was successfully generated. STAR --runThreadN 4 --outFilterType BySJout --outFilterMismatchNmax 999 --outFilterMultimapNmax 10…

Continue Reading STAR Genome indexing (Homo_sapiens_assembly38.fasta vs. GRCh38.primary_assembly.genome.fa)