Tag: picard

Info about rf vs fr with samtools or other?

I am working with Illumina GAII-derived mate-pair data. I am trying to identify two things in the sequences so that I can get rid of spurious data: 1. Are the reads rf or fr? I know picard must be able to identify this info from a sam…
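
As a rough illustration of the rf/fr distinction raised in this question, the sketch below classifies a single read pair from the leftmost mapped position and strand of each mate. The function name, its arguments and the fixed aligned length are assumptions made for this example. It is not Picard's or samtools' code, and real alignments would need end positions derived from the CIGAR string.

```python
def pair_orientation(pos1, reverse1, pos2, reverse2, aligned_len):
    """Classify a read pair as 'FR', 'RF' or 'TANDEM'.

    pos1, pos2: leftmost mapped positions of the two mates.
    reverse1, reverse2: True if that mate maps to the reverse strand.
    aligned_len: assumed aligned length of each mate (a simplification).
    """
    # Both mates on the same strand: neither FR nor RF.
    if reverse1 == reverse2:
        return "TANDEM"
    # Pick out which mate is on the forward strand and which on the reverse.
    fwd_pos, rev_pos = (pos2, pos1) if reverse1 else (pos1, pos2)
    # The 5' end of a forward read is its leftmost base;
    # the 5' end of a reverse read is its rightmost base.
    fwd_five_prime = fwd_pos
    rev_five_prime = rev_pos + aligned_len - 1
    # FR: mates point toward each other. RF: mates point away from each other.
    return "FR" if fwd_five_prime < rev_five_prime else "RF"
```

Counting the labels over all properly mapped pairs in a file would then show which orientation dominates the library.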

iPSCs derived from infertile men carrying complex genetic abnormalities can generate primordial germ-like cells

Patients and controls. Patient 1 was 38 years old and consulted for infertility after he and his partner had been trying to conceive for 2 years. The patient was the first child of unrelated parents, and he had four brothers and five sisters whose fertility status could not be determined…

PCR duplicates in FFPE RNASeq

Dear all, I am working on 100 RNASeq datasets generated with a stranded protocol and a Novaseq run. I need to perform variant calling on these samples, but I am facing some problems. I do not have access to DNA, so exome/targeted amplification is not…

Fast way to sort bam file by queryname similar to picard SortSam SORT_ORDER=queryname?

When sorting by queryname with Samtools (samtools sort -n), Samtools does a natural sort by colon-delimited subfield. On the other hand, when sorting by queryname with Picard (picard SortSam SORT_ORDER=queryname), Picard does not sort by colon-delimited subfield,…
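
To make the difference between the two orderings concrete, here is a minimal sketch that compares a plain string sort with a natural sort over colon-delimited subfields. The read names and the `natural_key` helper are invented for illustration. Neither ordering is claimed to reproduce the exact comparator used by Samtools or Picard.

```python
def natural_key(name):
    """Build a sort key that compares numeric subfields as integers."""
    key = []
    for field in name.split(":"):
        if field.isdigit():
            # Numeric subfield: compare by value, and sort before text fields.
            key.append((0, int(field), ""))
        else:
            # Text subfield: compare as a plain string.
            key.append((1, 0, field))
    return key


# Hypothetical read names that differ only in the last subfield.
names = ["run:1:10", "run:1:9", "run:1:100"]

lexicographic = sorted(names)            # character-by-character comparison
natural = sorted(names, key=natural_key)  # numeric subfields compared as numbers

print(lexicographic)  # ['run:1:10', 'run:1:100', 'run:1:9']
print(natural)        # ['run:1:9', 'run:1:10', 'run:1:100']
```

The two orders disagree as soon as numeric subfields have different digit counts, which is why a file sorted one way may be rejected by a tool expecting the other.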

SMARCE1 deficiency generates a targetable mSWI/SNF dependency in clear cell meningioma

Clapier, C. R., Iwasa, J., Cairns, B. R. & Peterson, C. L. Mechanisms of action and regulation of ATP-dependent chromatin-remodelling complexes. Nat. Rev. Mol. Cell Biol. 18, 407–422 (2017).
Mashtalir, N. et al. Modular organization and assembly of SWI/SNF family chromatin remodeling complexes…

Extract R1 and R2 from sam file generated by bowtie2

Hi everyone, how can I extract R1 and R2 from a sam file generated by bowtie2?

linux merge multiple files in picard

Why not use samtools?

    for folder in my_bam_folders/*; do
        samtools merge "$folder.bam" "$folder"/*.bam
    done

In general, samtools merge can merge all the bam files in a given directory like this:

    samtools merge merged.bam *.bam

EDIT: If samtools isn’t an option and you have to use Picard, what about something like…
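
The per-folder loop in this answer can also be sketched in Python. The version below only builds the `samtools merge` argument lists and does not run them, so the commands can be inspected first. The function name `build_merge_commands` and the directory layout (one subfolder of BAM files per sample) are assumptions made for the example.

```python
from pathlib import Path


def build_merge_commands(root):
    """Return one 'samtools merge' argument list per subfolder of root."""
    commands = []
    for folder in sorted(Path(root).iterdir()):
        # Skip plain files, including merged outputs from an earlier run.
        if not folder.is_dir():
            continue
        bams = sorted(str(p) for p in folder.glob("*.bam"))
        # Only emit a command when the folder actually contains BAM files.
        if bams:
            # The output is named after the folder and placed next to it.
            commands.append(["samtools", "merge", f"{folder}.bam", *bams])
    return commands
```

Each returned list could then be passed to `subprocess.run(cmd, check=True)` once the commands look right.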

Problem with the output of deepTools plotProfile: a strange pattern of repetitive summits

Hi! I am trying to plot DNA binding profiles of my ChIP-seq bw files using deepTools plotProfile. I generated the matrix using computeMatrix reference-point. I used some publicly available bed files as my regions of…

Detailed differences between sambamba and samtools

In March, my first post in the new student group asked whether false-positive mutations appear because duplicates are not marked thoroughly enough. It described the problem that supplementary reads are not marked as duplicates by GATK MarkDuplicates. Afterwards, in response to this question, I began…

A genome-scale screen for synthetic drivers of T cell proliferation

Abramson, J. S. et al. Transcend NHL 001: immunotherapy with the CD19-directed CAR T-cell product JCAR017 results in high complete response rates in relapsed or refractory B-cell non-Hodgkin lymphoma. Blood 128, 4192–4192 (2016).
Shifrut, E. et al. Genome-wide CRISPR screens in primary human T cells reveal key regulators…

Low transcript quantification with Salmon using GRCm39 annotations

Hi everyone, first time working with mouse samples and unfortunately, there are fewer resources available for the latest mouse Ensembl genome than I was expecting. What I’ve done: I performed rRNA depletion on total RNA extracted from mouse tissue and created Illumina libraries using a cDNA synthesis kit with random…

Population genomics of Escherichia coli in livestock-keeping households across a rapidly developing urban landscape

Karesh, W. B. et al. Ecology of zoonoses: natural and unnatural histories. Lancet 380, 1936–1945 (2012).
Wolfe, N. D., Dunavan, C. P. & Diamond, J. Origins of major human infectious diseases. Nature 447, 279–283 (2007).
Allen, T. et al…

HRJOB7442 Bioinformatics Scientist 2 (Various Locations) in Nether Alderley, Macclesfield (SK10) | Almac Group (Uk) Ltd

Bioinformatics Scientist 2
Hours: 37.5 hours per week
Salary: Competitive
Ref No: HRJOB7442
Business Unit: Diagnostic Services
Location: Craigavon or Manchester
Open To: Internal and External Applicants

The Company: Almac Diagnostic Services is a leading stratified medicine business, specialising in biomarker-driven clinical trials. We are incredibly proud to be involved…

Genomic analysis on Galaxy using Azure CycleCloud

Cloud computing and digital transformation have been powerful enablers for genomics. Genomics is expected to be an exabase-scale big data domain by 2025, posing data acquisition and storage challenges on par with other major generators of big data. Embracing digital transformation offers a practically limitless ability to meet the genomic…

python – Packages Not Found Error: Not available from current channel- Bioconda

Using a Mac with an M1 chip, I’m trying to install the following Bioconda packages: cutadapt, trim-galore, samtools, bedtools, htseq, bowtie2, deeptools, macs2. I’ve been able to install picard and fastqc with no issues, but all the others produce one of two error messages: PackagesNotFoundError: The following packages are not available from current channels: or Found conflicts! Looking…

java – GATK: HaplotypeCaller IntelPairHmm only detecting 1 thread

I can’t seem to get GATK to recognise the number of available threads. I am running GATK (4.2.4.1) in a conda environment which is part of a nextflow (v20.10.0) pipeline I’m writing. For whatever reason, I cannot get GATK to see there is more than one thread. I’ve tried different…

Efficiently merge two BAM files while retaining reads from only one file in overlapping regions

I have a WGS BAM file that is fairly large (>150GB) and a smaller BAM file (<5GB) with reads in a small 10Mbp region. I want to (efficiently) merge the two BAM files while…

sequence alignment – MarkDuplicatesSpark failing with cryptic error message. MarkDuplicates succeeds

I have been trying to follow the GATK Best Practice Workflow for ‘Data pre-processing for variant discovery’ (gatk.broadinstitute.org/hc/en-us/articles/360035535912). This has all been run on Windows Subsystem for Linux 2 on the Bash shell. I started off with FASTQ files from IGSR (www.internationalgenome.org/data-portal) and performed alignment with Bowtie2 (instead of…

is BBMap/Qualimap affected by log4j vulnerability

No, unless the tools are used as a library in a web server. It’s worth noting that picard.jar and abra.jar are affected (even though, as Pierre L says, these are unlikely to be attacked on most systems). If you’re responsible for systems, esp web…

Padding out a GVCF file with 1000G exomes to get gatk VariantRecalibrator working with a small sample

I’ve got sequencing data for a small 500 bp amplicon from a few samples. The GATK Best Practices suggest running VariantRecalibrator on the GVCF files I generate. I’m trying to get this working, but I get an error about “Found annotations with zero variances”. Reading the gatk manual and other posts…

[moiexpositoalonsolab/grenepipe] freebayes causes early error about number of threads

Hi Lucas, got a weird one for you. If I change the caller from haplotypecaller to freebayes, I get the error below. It’s doubly strange because it seems to occur well before freebayes would be used in the pipeline.

[Sat Dec 11 11:13:02 2021]
rule samtools_stats:
    input: dedup/111D03-1.bam
    output: qc/samtools-stats/111D03-1.txt…

Strange speed up in GATK LeftAlignIndels

Hi! I noticed something strange. I have been running a DNA-seq pipeline like this: reads -> bwa-mem2 -> picard SortSam -> picard MergeSamFiles -> picard MarkDuplicates -> gatk LeftAlignIndels … gatk LeftAlignIndels has always taken around 4 hours to complete with the…

Trouble running vcf2bam jvarkit tool

I am trying to use the tool called vcf2bam from jvarkit on a server and I have the following 2 files: GRCh38_latest_genomic.fna – the file is of format FASTQ , and 00-common_all.vcf. I used samtools faidx and also picard CreateSequenceDictionary, but when I try…

converting Bam to fastq while removing clipping(hard/soft clip bases)

Hello, I want to do some analysis and my raw data is paired-end reads in fastq files. So far, I have used BWA mem to produce a SAM file and then used samtools to convert it to a BAM file. My next step is…

Liftover nonmodel VCF

Hi all, I have a FASTA genome assembly and a VCF for my (nonmodel) study species. Now I want to liftover the VCF to the Zebra Finch genome (www.ncbi.nlm.nih.gov/assembly/GCF_003957565.1). I’ve found Picard LiftOver GATK and CrossMap, but both require a UCSC chain file, which apparently can…

Troubleshooting Tips – bcl2fastq creates duplicate reads

Hi, I have seen a few times where bcl2fastq (v2.20) will produce duplicate FASTQ entries in sequencing read IDs, raw sequences, & quality scores. This causes issues with downstream tools like Picard MarkDuplicates (e.g. Exception in thread “main” htsjdk.samtools.SAMException: Value was put…

Error in merged bam files

Hello, I am trying to merge unmapped and mapped bam files. I merged the bam files using the picard tool (gatk.broadinstitute.org/hc/en-us/articles/360036883871-MergeBamAlignment-Picard). I checked the merged bam using the ValidateSamFile command (gatk.broadinstitute.org/hc/en-us/articles/360036854731-ValidateSamFile-Picard-) and it showed the errors below:

Error Type                             Count
ERROR:MATES_ARE_SAME_END               5496
ERROR:MISMATCH_FLAG_MATE_NEG_STRAND    5478
ERROR:MISMATCH_MATE_CIGAR_STRING…

High frequency of an otherwise rare phenotype in a small and isolated tiger population

Significance: Small and isolated populations have low genetic variation due to founding bottlenecks and genetic drift. Few empirical studies demonstrate visible phenotypic change associated with drift using genetic data in endangered species. We used genomic analyses of a captive tiger pedigree to identify the genetic basis for a rare trait,…

Picard CalculateHsMetrics perTargetCoverage for Novaseq bams

Hello, I would like to use Picard’s CalculateHsMetrics to calculate per target coverage for Novaseq bam files. It seems that the tool is not able to calculate mean/normalized coverage for Novaseq bams but works well with Hiseq bams. Novaseq bams report quality scores…

allele balance gatk

Hi, I am trying to calculate allele balance for both heterozygous (.40 to .60) and homozygous sites from a vcf file. Please let me know how to achieve this with gatk. I tried using the FilterVCF (picard) command as follows: -I inputFile.vcf --MIN_AB -O outputfile. I would…

Paired-end reads reported without mates: how to play matchmaker?

Hi Everyone, I am currently looking at Acute Myeloid Leukemia (AML) paired-end WGS samples from the TARGET data ocg.cancer.gov/programs/target/target-methods#3241. A bioinformatician in our group remapped the samples from hg19 to hg38. Unfortunately, we do not have any copies of the hg19 version anymore. However, when I try to run anything…

Fastqc user manual – vodosp.ru

FASTQ format – Wikipedia 06 September 2021 – by TC Collin · 2020 · Cited by 3 — Be accompanied by a step-by-step user-friendly manual, If the user performs FastQC prior to the removal of adapters (step 3), the length Both programs can be used on Linux/MacOS X machines and quite…

Twist Bioscience hiring Bioinformatics Scientist, Production Bioinformatics in South San Francisco, California, United States

Twist is looking for a Bioinformatics Scientist to join our Production Bioinformatics Team. You will work alongside research scientists, software engineers and data scientists to further deliver on our mission to expand access to best-in-class synthetic biology and next-generation sequencing applications. You will be developing and engineering tools to better…

Snakemake-Alignment using BWA-MEM2

Hello, I have started using snakemake 6.5.2 to align fastq files with a reference file. I have pasted the error below in this question. How do I allocate memory in the snakefile and read the header from the samfile, '-'? This is the snakefile (wrapper for running alignment):

rule bwa_mem2_mem:
    input:
        reads=["/scicore/home/cichon/GROUP/test_workflow/samples/{sample}.1.fq", "/scicore/home/cichon/GROUP/test_workflow/samples/{sample}.2.fq"]…

Mapping reads and quantifying genes

Hello, I am using the following metagenomic workshop tutorial to analyse my own metagenomic data. metagenomics-workshop.readthedocs.io/en/latest/annotation/quantification.html I performed the following steps:
- Mapped reads with bowtie2 and generated a .bam file with samtools sort.
- Removed duplicates with picard.
- Extracted gene information from prokka…

Can I sort my bam files with Picard MergeSamFiles?

Hi! I noticed this in the picard MergeSamFiles help:

--SORT_ORDER,-SO:SortOrder Sort order of output file Default value: coordinate. Possible values: {unsorted, queryname, coordinate, duplicate, unknown}

Does this mean that it is unnecessary to use picard SortSam beforehand? Can MergeSamFiles do…

Missing read group in BAM files

Hello everyone, I have processed PE reads through the pipeline HybPiper to align them to a reference genome with GATK. But inspecting the output BAM files with the GATK tool ValidateSamFile, I found a very common error in the error report: WARNING::RECORD_MISSING_READ_GROUP…

Looking for a tool which provides mapping quality score distributions from BAM files

Hello BioStars, Is there a tool which generates mapping quality score distributions from bam files? I know I could potentially do this myself, but I am looking for something which would essentially do the work for…

So many variants detected.

Dear All, I have done variant calling on germline data that has a single sample for each individual and two genes. I followed the steps below, but after checking the results I found too many variants. After HaplotypeCaller (step 6) I found 140900 known variants, and the…

CROP-seq data analysis

Hi, I am a newbie to single cell sequencing analysis. I have to analyze CROP-seq data, and I am going through the following paper: www.nature.com/articles/nmeth.4177. I have to use Cell Ranger (instead of the DROP-seq software) as the first step to process the single cell data. I wanted…
