Tag: mpileup

A Benchmark of Genetic Variant Calling Pipelines Using Metagenomic Short-Read Sequencing

Introduction Short-read metagenomic sequencing is the technique most widely used to explore the natural habitat of millions of bacteria. In comparison with 16S rRNA sequencing, shotgun metagenomic sequencing (MGS) provides sequence information of the whole genomes, which can be used to identify different genes present in an individual bacterium and…

keep read association with mpileup

Hey, is there a way to line up all bases that come from the same read in one vertical column using mpileup? So making this
chr2 96 C 5 .,.,,
chr2 97 A 4 .,.,
chr2 98 C 6 .,.,,,
chr2 99 C 8 .,.,,,..
chr2 00 A 9 .,,,..,,.
chr2…

Ribo-Seq Samtools pileup

I have the results of samtools mpileup.
ref|NC_001133| 32065 A 17 .....^".^".^".^".^!.^".^".^".^".^".^".^".
ref|NC_001133| 32066 G 18 .................. C@1C.1CCCCCCCCCCCC
ref|NC_001133| 32067 A 22 ...................... CC98?C91?CCC;;CCCC;C;C
ref|NC_001133| 32068 T 21 ..................... CCCCCCCCCCCCCCCCCCCCC
What I want to do is use the gene start and stop position in the reference…

Randomized phase II study of preoperative afatinib in untreated head and neck cancers: predictive and pharmacodynamic biomarkers of activity

Study objectives and endpoints The main objective consisted in identifying predictive biomarkers of efficacy by exploring correlation between baseline potential biomarkers and radiological and metabolic responses to afatinib. Secondary objectives were to identify potential pharmacodynamic biomarkers, to evaluate the efficacy and safety of afatinib and to assess the metabolic and…

Phenotypic drug-susceptibility profiles and genetic analysis based on whole-genome sequencing of Mycobacterium avium complex isolates in Thailand

Abstract Mycobacterium avium complex (MAC) infections are a significant clinical challenge. Determining drug-susceptibility profiles and the genetic basis of drug resistance is crucial for guiding effective treatment strategies. This study aimed to determine the drug-susceptibility profiles of MAC clinical isolates and to investigate the genetic basis conferring drug resistance using…

SNP calling with many samples using bcftools

Hello, I aim to identify SNPs from approximately 500 BAM files (non-human). I’m opting for bcftools since GATK, even with the Spark addition, takes a substantial 6 hours per sample. My objective is to generate a single VCF file encompassing all SNPs…

format error, unexpected A at line 1

I had a problem using bcftools. After running the command line (below), I got an error in my results. The error message stated: “Note: none of --samples-file, --ploidy or --ploidy-file given, assuming all sites are diploid [E::fai_build_core] Format error, unexpected…

remove low coverage mpileup file

Hi all, I am working with an mpileup file (generated via samtools) composed of pool-seq WGS of multiple populations. I want to remove all positions with read coverage <5 across any population. Mpileup files are formatted so that the second column refers to the…
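One way to approach the filtering described in this excerpt, as a minimal sketch rather than the poster's actual solution: it assumes the standard tab-separated samtools mpileup layout (chromosome, position, reference base, then depth, bases and qualities for each input), and the function names and the default threshold of 5 are illustrative choices.

```python
from typing import Iterable, Iterator


def passes_min_depth(line: str, min_depth: int = 5) -> bool:
    """Return True if every sample on this mpileup line has depth >= min_depth."""
    fields = line.rstrip("\n").split("\t")
    # Columns 1-3 are chromosome, position and reference base. Each sample
    # then contributes three columns (depth, bases, qualities), so the
    # depths sit at 0-based indices 3, 6, 9, ...
    depths = [int(value) for value in fields[3::3]]
    return bool(depths) and all(depth >= min_depth for depth in depths)


def filter_mpileup(lines: Iterable[str], min_depth: int = 5) -> Iterator[str]:
    """Yield only the lines where all samples reach the minimum depth."""
    for line in lines:
        if line.strip() and passes_min_depth(line, min_depth):
            yield line
```

Applied to an open file handle, `filter_mpileup(handle)` streams the kept lines without loading the whole file, which matters for genome-wide pool-seq pileups.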

Read depth mpileup and column id

Hi, I am working with an mpileup file generated via Popoolation2. All my data is paired-end pooled WGS derived from 24 unique populations. I want to filter my file to only include positions with a read depth of >5. Is the read depth represented only once for all populations in…

NGS one-liner to call variants

This is a tutorial about creating a pipeline for sequence analysis in a single line. It is made with capture/amplicon short-read sequencing of human DNA in mind and tested with the reference exome sequencing data described here. I share the process and debugging steps…

NGS oneliner

This is a tutorial about creating a pipeline for sequence analysis in a single line. I share the process and debugging steps gone through while putting it together. Source is available at: github.com/barslmn/ngsoneliner/ I couldn’t make a longer post; the complete version of this post is at: omics.sbs/blog/NGSoneliner/NGSoneliner.html Pipeline # fastp --in1 "$R1"…

Most sensible way to find private SNPs from a multisamples vcf with bcftools

Hello, this question is somehow complementary to what I asked yesterday here: Using bcftools to find unique alt homozygous sites Now let’s say I want to find the SNPs 0/1 unique to the sample D3A350g_bcftools2 (see below) I know I can use bcftools view -s D3A350g_bcftools2.bcf -x all_bcftools2_merged.vcf But there…

Using bcftools to find unique alt homozygous sites

Hello, I have a vcf with 20 samples. I want to find for each sample the sites that are 1/1, only in that sample (so other samples must have genotypes 0/1 or 0/0). I know I can use filters such as GT="aa". However, how do I say GT="aa" for sample…

Targeted knockout of a conserved plant mitochondrial gene by genome editing

Plant material and growth conditions Nicotiana tabacum cultivar Petit Havana was used for all experiments. The TALEN design and the TALEN-expressing line Nt-JF1006-30 were described previously19. For plant growth under sterile conditions, surface-sterilized seeds were germinated on Murashige and Skoog (MS) medium52 consisting of premixed MS salts and modified vitamins…

Solved We are now going to call variants with two different

We are now going to call variants with two different approaches from the files we have been working with all course. Please use the following files, parameters, and listed versions of the software for this assignment. We will use the reference Ebola genome: /data/compres/refs/AF086833.2.fasta And this set of paired-end sequences:…

Samtools mpileup for RNA Editing levels generating empty output file

I am working on a bash script that takes in bam files (respective paths listed out as required by mpileup documentation), a reference fasta (that bams were aligned to) and a tab separated “known sites” tsv that contains known…

i don’t know error UCSC hg38.fa reference

I ran the commands:
sequenza-utils bam2seqz -p --normal ${RESULTS}/5_variant_calling/${sample}_N.mpileup --tumor ${RESULTS}/5_variant_calling/${sample}_T.mpileup --fasta ${REFER}/hg38.fa -gc ${REFER}/hg38_genome_gc50.wig.gz -o ${RESULTS}/8_seq/${sample}_seqz.gz
sequenza-utils seqz_binning --seqz ${RESULTS}/8_seq/${sample}_seqz.gz -w 50 -o ${RESULTS}/8_seq/${sample}_small_seqz.gz
The results directory contains chromosome_depth.pdf, gd_plot.pdf and sequenza_extract.RData. In chromosome_depth.pdf the chromosomes are listed as chr 1, 10, 11, chr11_KI270721v1_random… Why does the result file show chr 2 ~xy?

mRNA vaccine quality analysis using RNA sequencing

Design and synthesis of reference plasmid A reference construct was first designed, with the intention of optimising the production of RNA therapeutics for pre-clinical research. The coding sequence of eGFP30 was selected as a reporter in the coding region, as its protein product can be assayed simply through Flow cytometry…

sarek: Introduction

Introduction nf-core/sarek is a workflow designed to detect variants on whole genome or targeted sequencing data. Initially designed for Human, and Mouse, it can work on any species with a reference genome. Sarek can also handle tumour / normal pairs and could include additional relapses. The pipeline is built using…

How to generate a consensus sequence from BAM file with bcftools?

Hello, I have aligned some files against a reference genome with BWA-MEM, deduplicated and sorted with sambamba to generate an AlnSrtDedSrt.bam file. The question is: how can I generate a consensus fasta file? I have this fragment of…

Problem while working with sequenza

Hi, I’m trying to work with sequenza in order to calculate HRD score of a sample using WES data. When I run sequenza, I get a message saying that “chromosomes are out of order”, and I don’t know how…

Error using sequenza-utils with WES

Hi! I’m trying to calculate the homologous recombination deficiency score of a cell line (MDA-MB-231) using whole exome sequencing data. To do this, I intend to use the scarHRD package in R, but I first need a “*.seqz.gz” file, which is made using sequenza-utils, and I get the following error…

Multiparameter prediction of myeloid neoplasia risk

Data acquisition UKB is a large-scale biomedical database and research resource containing genetic, lifestyle and health information from half a million UK participants. UKB has approval from the North West Multicentre Research Ethics Committee (11/NW/0382) and all participants provided written informed consent. The present study has been conducted under approved…

Create a reference genome from aligned bam file

So, the bam file you have is an alignment from fastq files to a reference genome. And I’m also assuming you did it yourself and have access to the files, so you can test out other alignment softwares. A note is that these commands are after the alignment using bowtie2,…

How does bcftools decide what sample name to assign when calling variants?

How does bcftools decide what sample names to assign in the vcf when performing variant calling using mpileup and call commands? I’m using bcftools to call variants from an aligned bam file like this samtools mpileup -A…

Clarification for bcftools consensus

Hi all, I have a .vcf file I generated using bcftools mpileup and then filtered to retain positions of interest. I now want to generate a consensus sequence using bcftools consensus. The issue I’ve had so far is that positions in the .vcf file that…

Upcycling rice yield trial data using a weather-driven crop growth model

Phenotype data We obtained yield datasets for rice (Oryza sativa L.) from 207,331 trials with 8524 cultivars during the 38 years from 1980 to 2017. The data were obtained from field trials at 110 public agricultural experimental stations in Japan conducted by the Institute of Crop Science of the National…

error reading from input file

I am writing a code to analyse yeast sequencing data using R. I was able to do the alignment using BWA and thus obtain a .bam file and its .bai index using samtools. My next step is to perform variant calling…

Too high number of SNPs using ddRAD data (36 cattle)

Hello I have ddRAD data of 36 cattle that I made a vcf file out of using samtools sort, bcftools mpileup and call. The number of SNPs I am getting is very high (in millions). I referred to other…

AD sum about half of DP in vcf file

In my vcf file, the sum of total allelic depth (AD) is about half of the raw read depth (DP) for all SNPs. Is this common? If not, any thoughts on what might be the reason(s) behind this? I use a…

Use bbmap reformat.sh to convert from paired fq files to a bam file

As outlined here I was able to create paired-end fastq files with the help of GenoMax. Now I wonder how I can use reformat.sh from bbmap to convert these files to a valid bam file…

bcftools mpileup error

Hi, I’m trying to submit a variant calling job using bcftools mpileup to the HPC server like: bsub -q prod -P gfap-anno-test -J gfap-anno-test -R "rusage[mem=9000,scr=5000] span[hosts=1]" -n 5 bcftools mpileup -Ou -f sample_dedupl.bam | bcftools call -mv -Ob -o out_sample.bcf It throws an error –…

Use bbmap reformat.sh to convert from paired fq files to a valid abam file

As outlined here I was able to create paired-end fastq files with the help of GenoMax. Now I wonder how I can use reformat.sh from bbmap to convert these files to a valid bam…

Generating consensus sequence from bam file

samtools mpileup -uf reference.fasta sorted_aligned_reads.bam | bcftools call -c | vcfutils.pl vcf2fq > consensus.fastq
Here’s a breakdown of the command:
samtools mpileup: Generates a pileup of aligned reads at each position in the reference genome.
-u: Output in uncompressed BCF format.
-f reference.fasta: Specifies the reference genome in FASTA format.…

The .bcf file generated by bcftools mpileup cannot be opened with BCFtools

Hello everyone! I’m encountering an issue where both “bcftools view” and “bcftools call” are unable to open the .bcf file when using the following code. Does anyone have any suggestions or advice? Thank you! My code: bcftools…

How to extract read counts at the mutation locations

I have a scDNAseq dataset having multiple FASTQ files for multiple single cells. samtools was used after aligning FASTQ files with BWA to hg19 reference to produce bam files. I have already identified 36 SNV mutation sites and I want to use mpileup to extract read counts (Total read count…
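Per-site read counts of the kind asked about here can be tallied from the bases column of samtools mpileup output. The following is a rough sketch, not the poster's method: it assumes the documented pileup encoding (`.` and `,` for reference matches, letters for mismatches, `^` plus one character for a read start, `$` for a read end, `+N`/`-N` followed by N bases for indels), and the function name `count_bases` is made up for illustration.

```python
from collections import Counter


def count_bases(ref: str, bases: str) -> Counter:
    """Count observed bases at one pileup position, ignoring indels and markers."""
    counts: Counter = Counter()
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            i += 2  # skip the read-start marker and its mapping-quality character
        elif ch in "+-":
            # Indel: sign, decimal length, then that many inserted/deleted bases.
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            length = int(bases[i + 1:j]) if j > i + 1 else 0
            i = j + length
        elif ch in ".,":
            counts[ref.upper()] += 1  # match on forward or reverse strand
            i += 1
        elif ch.upper() in "ACGTN":
            counts[ch.upper()] += 1  # mismatch on either strand
            i += 1
        else:
            i += 1  # '$', '*', and other placeholders carry no base call
    return counts
```

Summing the counter gives the number of base calls at the site, and `counts[alt]` gives the reads supporting a given alternate allele; strand is deliberately collapsed in this sketch.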

ftbfs and test failure against htslib 1.17

Source: samtools Version: 1.16.1-1 Severity: important Tags: ftbfs Hi, When samtools is tested against htslib 1.17 now available in experimental, I witness the following error, either from build time checks or from autopkgtest: The command failed [256]: /tmp/autopkgtest.PsRbbX/autopkgtest_tmp/samtools view -e 'pos<1000||pos>1200' -O cram,embed_ref=1 -T test/dat/mpileup.ref.fa -o /tmp/autopkgtest.PsRbbX/autopkgtest_tmp/test/reference/mpileup.1.tmp.cram test/dat/mpileup.1.sam out: err:[E::validate_md5]…

Yersinia pestis genomes reveal plague in Britain 4000 years ago

All radiocarbon dates were calibrated in OxCal 4.4 using the IntCal20 calibration curve18,19. There is no stable carbon and nitrogen isotopic evidence for any detectable input of marine or freshwater foods that would require a correction for reservoir effects. Charterhouse Warren: Archaeological context Charterhouse Warren is a natural shaft in…

Phylogenomic analysis supports Mycobacterium tuberculosis transmission between humans and elephants

1. Introduction Tuberculosis (TB) is a significant global burden and is widely reported to be a major public health and economic problem, costing the world $617 billion between 2000 and 2015 and projected to cost $1 trillion between 2015 and 2030 (1). It is the second leading cause of death…

The wheat stem rust resistance gene Sr43 encodes an unusual protein kinase

Mutant collection development We mutagenized 2,700 seeds of the wheat–Th. elongatum introgression line RWG34 containing Sr43 (ref. 29). Dry seeds were incubated for 16 h with 200 ml of a 0.8% (w/v) EMS solution with constant shaking on a Roller Mixer (Model SRT1, Stuart Scientific) to ensure maximum homogenous exposure of the…

An unusual tandem kinase fusion protein confers leaf rust resistance in wheat

Plant material Bread wheat accessions Transfer (TA5524), WL711, TA5605, Ae. umbellulata accession TA1851 and Ae. triuncialis accession TA10438 were obtained from the Wheat Genetics Resource Center (WGRC). TcLr9 (Transfer/6*Thatcher) is a near-isogenic line carrying Lr9 from Transfer in the genetic background of the susceptible wheat line Thatcher. TcLr9 and TA5605…

bcftools get allele abundance

I’m using bcftools to extract variants from a bam file, but I have reference data that tells me whether the patient is homozygous or heterozygous. For a particular sample, I see a high proportion of the alternate allele (87%) and a lower proportion of the reference allele (13%), yet according…

Detect mutations in clonally propagated plants

I am analysing Illumina whole-genome resequencing data from two clonally propagated plants aiming to find any potential variants that are unique to either of the two. Note that these would be expected to be somatic mutations and that due to the nature of…

Variant calling using samtools

Hi all, I wrote my output as a VCF file rather than a BCF file as in the instructions. Is that OK, or is the output incorrect? Thank you so much! bcftools mpileup -f Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa finalBamFile2.bam | bcftools call -mv -Ob -o variants.vcf.gz bcftools mpileup -f reference.fa…

Nextflow process for samtools sort and variant calling

Hi everybody! I’m currently trying to learn Nextflow scripting. I’m a novice and I’m just beginning with a script to sort and convert a SAM file to a BAM file and then call variants using a reference…

ngs – calculate mismatch frequency/rate from a BAM file

“I am not using any variant caller.” Use a variant caller. bcftools, at least, will output variant frequency per variant: bcftools mpileup -ugf ref.fa sample.bam | bcftools call -mv > output.vcf As an alternative approach, I wrote some code to parse the mpileup output and extract counts for different alleles;…

mpileup read base output

mpileup read base output 0 I am running mpileup and generally understand the output of the column containing the bases found at a certain position. However, this is added onto the end of that column. ^\’.^I.^I.^I.^W.^.^R.^I.^!.^I.^E.^\’.^I.^-.^!.^I.^I.^!.^].^!.^I.^8.^I.^I.^I.^6.^I.^I.^I.^!.^I.^I.^$.^!.^!.^].^?.^!.^,.^!.^\’.^I.^!.^\.^I.^\.^\’.^I.^9.^I.^!.^9.^J.^I.^..^I.^I.^I.^I.^I.^I.^T.^I.^\.^I.^E.^!.^”.^!.^\.^9.^!.^I.^!.^E.^I.^M.^I.^I.^I.^.^.^!.^.^\’.^!.^R.^I.^].^].^I.^3.^I.^”.^I.^I.^9.^I.^I.^-.^I.^I.^(.^!.^!.^I.^!.^!.^E.^I.^I.^!.^!.^Q.^I.^I.^I. I’m not sure what this means and can’t seem to find anything in the…

Difference between vcf2fq and bcftools consensus

Hi everybody, I’m working on generating consensus sequences and I’m wondering about the differences in workflow between bcftools consensus and vcf2fq from vcfutils.pl. On the one hand, with vcf2fq, I can generate a consensus sequence with the following command: bcftools mpileup…

Change sequence ID in fastq file generated by bcftools mpileup

Hi everybody! I’m currently working on an HHV8 genetic study and I’m facing an issue with my bcftools command. I want to generate consensus sequences using the bcftools mpileup command and bam files. However, all ID get…

How to force bcftools to call all variants

Hello I am using bcftools to call variants with this command: bcftools mpileup -Ou -b bamlist -f ref.fasta | bcftools call -Ob -mv >variant.bcf However, for some specific variants that I know to exist (looking at bam files with IGV), I…

How do I call all the variants with BCFTOOLS?

Hello I am using bcftools to call variants with this command: bcftools mpileup -B -q30 -Q30 -f reference.fasta -a FORMAT/DP,FORMAT/AD --threads 6 -R list_of_specific_position.txt file.bam | bcftools call -m -f GQ -O v -o call_variant.vcf For some specific variants that…

How can I use bcftools mpileup or an alternative to find ALL variants without any probabilistic inference?

Hello! I have a pipeline for a maximum depth sequencing project. Briefly, this means I can ignore PCR errors because I check for consensus of UMI-tagged sequences. Therefore, once I have a…

Bcftools consensus generates mismatched consensus sequence

Hi everyone, Recently, I have been using bcftools consensus to generate a consensus sequence with the following commands:
bcftools mpileup -Ou -f ref.fa in.bam | bcftools call -Ou -mv --ploidy 1 | bcftools norm -f ref.fa -Oz -o norm.vcf.gz
bcftools index norm.vcf.gz
bcftools consensus -f ref.fa -o consensus.fa norm.vcf.gz
However, the…

BCFtools for somatic vs. germline variant calling

Hi there, I have seen the workflow ‘mpileup > call’ using BCFtools discussed in the context of both germline and somatic variant calling. It’s not clear to me, then, how the program differentiates between the two. If I’m seeking to identify strictly…

how to get to a VCF from bam files

Hello, My situation is as follows: I have two groups of reads/individuals that differ in terms of indels (one group has the indels, the other doesn’t). I already mapped them and generated bam files. So, now I am struggling to…

Problem generating a .vcf after upgrade of samtools and bcftools

Hi I used to go over candidate sites of variation using SAMtools mpileup after which I used to execute some evaluations of the data using BCFtools. In general I used to provide the reference fasta genome and use the…

mpileup2sync

Hello there, I’m new to doing this type of analysis. I’m trying to convert a mpileup file into the synchronized file format (sync) but I have a problem using the script that I found. This is the script: mpileup2sync --input pools_all.mpileup --output pools_all.sync --fastq-type sanger --min-qual 20 --threads…

samtools mpileup – bases string explanation

Hi, I am trying to understand the samtools mpileup bases string output and I am having problems with: ^ (caret) marks the start of a read segment and the ASCII of the character following '^' minus 33 gives the mapping quality $ (dollar)…
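The caret rule quoted in this excerpt can be made concrete with a short sketch. It assumes only what the quoted documentation states (the character after `^` encodes mapping quality as ASCII minus 33) plus the usual `+N`/`-N` indel notation, which has to be skipped so indel bases are not misread; the function name `read_start_mapqs` is hypothetical.

```python
from typing import List


def read_start_mapqs(bases: str) -> List[int]:
    """Return the mapping quality of every read that starts at this position."""
    mapqs: List[int] = []
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^" and i + 1 < len(bases):
            # The character right after '^' encodes MAPQ as ASCII code - 33.
            mapqs.append(ord(bases[i + 1]) - 33)
            i += 2
        elif ch in "+-":
            # Skip an indel: sign, decimal length, then that many bases.
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            length = int(bases[i + 1:j]) if j > i + 1 else 0
            i = j + length
        else:
            i += 1
    return mapqs
```

For example, `^I.` marks a read starting with a reference match and mapping quality 40, because `I` is ASCII 73, while `^!.` marks one with mapping quality 0.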

Calling with samtools mpileup calls fewer SNPs than expected.

Hello, I’m a master’s student, and I’m embarrassed to say that I’m not very good at handling bam files and samtools. I tried variant calling with samtools mpileup, but as a result I got only a few SNPs (fewer SNPs than expected). As you…

Genome- and transcriptome-wide splicing associations with alcohol use disorder

Samples RNA-seq We used the same publicly available data source of human post-mortem brain samples as Van Booven et al.7, which were collected from the New South Wales Brain Tissue Resource Center. Van Booven et al.7 also performed differential splicing, but they used different methods, included individuals from disparate ancestral…

Chromosome “whole genome shotgun sequence” not found

Hello everyone, I hope that you’re okay. Today I’m trying to do an analysis with population 2, but first I have to combine my .bam files using mpileup. The problem is that when I try to combine the .bam files using mpileup…

reference for freebayes or samtools mpileup after extracting chromosome from alignment

Good evening, I have extracted one chromosome from alignment map (.bam), using samtools view: samtools view -b map.bam chr1 > map_chr1.bam Now I would like to perform SNP calling using freebayes. Is it correct to use chr1.fasta as…

samtools 1.14 mpileup excludes duplicates

Wildcard error in Snakemake – clarification on inputs

Error: Building DAG of jobs… WildcardError in line 502 of /path/to/pipeline/workflow/Snakefile.py: Wildcards in input files cannot be determined from output files: ‘anc_r’ Code: import os import json from datetime import datetime from glob import iglob # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Define Constants ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # # discover input files using path from run config SAMPLES…

reference for samtools mpileup after extracting chromosome from alignment

Good evening, I have extracted one chromosome from alignment map (.bam), using samtools view: samtools view -b map.bam chr1 > map_chr1.bam Now I would like to prepare file for SNP calling using samtools mpileup. Is it correct to use chr1.fasta…

No genotype likelihoods when doing SNP calling using bcftools

Hello everyone, I am trying to get genotype likelihoods using bcftools. I am using bcftools version 1.11, running bcftools mpileup and bcftools call. This is what I run: bcftools mpileup -d 8000 -Ou -f $reference $input | bcftools call -mv -Ob -o $variants However, when I check the columns INFO…

Mitigating reference bias in genotype calls

Hello, I am working with whole genome resequencing data from non-reference organisms. I am working with low-to-medium depth data (8X-20X) and as expected, there is a bias towards the reference allele during mapping and genotype calling. I have now encountered several scenarios where this bias overwhelms biological signal and misleads…

Illumina Novaseq 6000 base quality values

How does one interpret the quality score in the FASTQ (or BAM) results coming out from the Illumina Novaseq 6000 Sequencer and DRAGEN pipeline? Any ideas or pointers?
Occur | ASCII | ASC-to-Num | PHRED Q value?
82 | * | (42-33) or 9 | Q10? Q0?
65 | 5 | 20 |
152 | 7 | 22 |
37377 | : | (58-33)…
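The ASCII-to-number step in this question can be checked with a few lines of Python. This is only a sketch assuming the standard Phred+33 encoding used in FASTQ and SAM/BAM quality strings; it does not address how the sequencer or DRAGEN bins quality values, and the function name `phred_scores` is illustrative.

```python
from typing import List


def phred_scores(quality_string: str, offset: int = 33) -> List[int]:
    """Decode a quality string into Phred scores (ASCII code minus offset)."""
    return [ord(char) - offset for char in quality_string]


# The four characters mentioned in the question: '*', '5', '7', ':'
print(phred_scores("*57:"))  # [9, 20, 22, 25]
```

Under this encoding `*` decodes to 9 (42 minus 33), matching the arithmetic shown in the excerpt.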

How can I creating multisample VCF file ?

Hello, I want to create a multisample VCF file. I have bam files from various alignments. I ran the command bcftools mpileup -d 100000 -f ~reference *.bam | bcftools call -c > concate.vcf Is this step scientifically correct? Or do I…

Range-wide whole-genome resequencing of the brown bear reveals drivers of intraspecies divergence

Sample collection We obtained the short read sequences for 33 brown bear genomes, four polar bears (Ursus maritimus) and two American black bears (Ursus americanus), publicly available from NCBI’s SRA repository (Table S1 and Fig. 1a)12,13,15,16,40,51,65. Next, we selected from our private collections a total of 95 additional samples for sequencing, among…

find tandem repeats in DNA from CRAM/VCF file

I want to find tandem repeats in DNA. I have access to CRAM file and the VCF file. I initially tried to get the insertions from the VCF file, but I am not sure if the variant caller has included all…

Change sample ID in BAM file to cell barcode

Hi all. I have a BAM file from one 10X scRNAseq sample. I want to try a tool out which was designed for a different type of data. The input for this tool is the output from samtools mpileup. samtools…

Continue Reading Change sample ID in BAM file to cell barcode

samtools calmd and original base quality

Hi, I’m currently trying to use samtools calmd to calculate MD and NM tags for my BAM files. I noticed that the typical usage mentioned in the documentation (www.htslib.org/doc/samtools-calmd.html) is samtools calmd -bAr aln.bam > aln.baq.bam and the params -b -A -r mean:…

Continue Reading samtools calmd and original base quality

Samtools Htslib Issues

Issue Title | State | Comments | Created Date | Updated Date
How to get a specific chromosome | open | 1 | 2022-07-14 | 2022-07-18
tabix returns row from VCF file multiple times | open | 4 | 2022-07-11 | 2022-07-18
Modified base parsing failure failure | closed | 0 | 2022-07-01 | 2022-07-18
extract genotype information | open | 1 | 2022-06-24 | 2022-07-18
sam_hdr_remove_lines is inefficient if…

Continue Reading Samtools Htslib Issues

In this analysis, low-coverage whole-genome sequencing of cfDNA was performed to examine blood plasma from patients with spine metastasis

In this analysis, low-coverage whole-genome sequencing of cfDNA was performed to examine blood plasma from patients with spine metastasis. An analysis pipeline was built and validated to evaluate the CNV status in the cfDNA, in order to determine whether the CIN score, which has…

Continue Reading In this analysis, low-coverage whole-genome sequencing of cfDNA was performed to examine blood plasma from patients with spine metastasis

sequencing – Interpreting ‘samtools mpileup’ output for multiple inputs

I would like to calculate sequencing coverage for a WGS project. Both long and short reads. I’ve used samtools as following: samtools mpileup -Q 1 -aa illumina_sorted.bam nanopore_sorted.bam > depth.txt Previously, when I used samtools depth instead, I only had the columns I was interested in (chromosome name / base…

Continue Reading sequencing – Interpreting ‘samtools mpileup’ output for multiple inputs
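
For the multi-input mpileup question above, the following is a rough sketch of how per-file depths could be pulled out of the text output. It assumes the default samtools mpileup layout, where the first three tab-separated columns are chromosome, position and reference base, and each input BAM then contributes three columns (depth, bases, qualities). The function name and the example line are hypothetical.

```python
from typing import List, Tuple


def per_file_depths(line: str) -> Tuple[str, int, List[int]]:
    """Return (chromosome, position, [depth per input file]) for one mpileup line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos = fields[0], int(fields[1])
    per_file = fields[3:]  # skip chromosome, position, reference base
    if not per_file or len(per_file) % 3 != 0:
        raise ValueError("expected three columns (depth, bases, quals) per input file")
    # The depth is the first column of each three-column group.
    depths = [int(per_file[i]) for i in range(0, len(per_file), 3)]
    return chrom, pos, depths


if __name__ == "__main__":
    # Hypothetical line for two inputs (e.g. Illumina first, then Nanopore).
    example = "chr1\t100\tN\t3\t...\tIII\t1\t.\t5"
    print(per_file_depths(example))
```

The order of the depths follows the order of the BAM files on the command line, so in the command quoted above the first value would belong to illumina_sorted.bam and the second to nanopore_sorted.bam.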

using ANNOVAR annotation clinvar database out wrong position

Hello Biostars, I was trying to annotate a VCF using ANNOVAR, but I get the wrong output; it seems my clinvar database is not suitable. bcftools_callCommand=call -m -v -o /project/plantform/20220316PCR/03.amplify/L2107973CFD7G5kxT1/L2107973CFD7G5kxT1.variation.vcf /project/plantform/20220316PCR/03.amplify/L2107973CFD7G5kxT1/L2107973CFD7G5kxT1.mpileup.vcf

Continue Reading using ANNOVAR annotation clinvar database out wrong position

samtools mpileup error – 1 samples in 1 input files

Hi All, I am relatively new to bioinformatics and have encountered an issue when trying to generate an mpileup file with samtools. I entered the following command: samtools mpileup -f /home/path_to_reference/nCoV_Jan31.fa.fasta sorted_sample1.sam > sample.mpileup The message returned is…

Continue Reading samtools mpileup error – 1 samples in 1 input files

How to call LOH with FreeC

Good morning, I am trying to infer loss of heterozygosity (LOH) from WGS data using FreeC. For this purpose, I am using these parameters in the “[BAF]” section of the configuration file: [BAF] makePileup = My_somaticVCF.vcf.gz fastaFile = hg19.fa SNPfile = hg19_snp142.SingleDiNucl.1based.txt.gz When…

Continue Reading How to call LOH with FreeC

Removing reads which map to certain region of reference

I have mapped reads to a reference genome of a related species. I want to remove reads which map to a specific region (chromosome) of the reference, but I don’t know what the best way to go about it is…

Continue Reading Removing reads which map to certain region of reference

How to call variant by --max-depth for RNAseq

Hi everyone! I have a query regarding variant calling from a high coverage site on the basis of the maximum likelihood variant. I have a BAM file of mapped RNA-seq data. I called variants using the command below. “bcftools mpileup --max-depth 10000 -Oz -f ref.fa sample.bam | bcftools call -mv -Oz -o…

Continue Reading How to call variant by --max-depth for RNAseq

Parallel genomic responses to historical climate change and high elevation in East Asian songbirds

Extreme environments present profound physiological stress. The adaptation of closely related species to these environments is likely to invoke congruent genetic responses resulting in similar physiological and/or morphological adaptations, a process termed “parallel evolution” (1). Existing evidence shows that parallel evolution is more common at the phenotypic level than at…

Continue Reading Parallel genomic responses to historical climate change and high elevation in East Asian songbirds

VCF samtools

Hello, I am having trouble when doing variant calling with samtools. I am getting only the header and no variants. If I instead use Freebayes, I do get a lot of variants, and with GATK, I get just a few. What can the problem be? Do…

Continue Reading VCF samtools

Single-cell DNA and RNA sequencing reveals the dynamics of intra-tumor heterogeneity in a colorectal cancer model | BMC Biology

Organoid culture of small intestinal cells and lentiviral transduction C57BL/6J mice and BALB/cAnu/nu immune-deficient nude mice were purchased from CLEA Japan (Tokyo, Japan). The small intestine was harvested from wild-type male C57BL/6J mice at 3–5 weeks of age (Additional file 1: Figure S9A). Crypts were purified and dissociated into single cells,…

Continue Reading Single-cell DNA and RNA sequencing reveals the dynamics of intra-tumor heterogeneity in a colorectal cancer model | BMC Biology

The sardine run in southeastern Africa is a mass migration into an ecological trap

INTRODUCTION Large-scale annual migrations occur in an extraordinary range of animals, from insects to the great whales. While the driving mechanisms of these migrations are varied and sometimes poorly understood, they often represent a way of optimizing conditions for breeding and adult fitness when these are in conflict. Often, populations…

Continue Reading The sardine run in southeastern Africa is a mass migration into an ecological trap

samtools mpileup fail to create bcf

I have indexed my reference.fasta using bowtie2: bowtie2-build reference.fasta reference.fasta I then created the BAM file from the SAM file using samtools, and sorted and indexed the BAM file: samtools view -S -b Sample1_mapped.sam > Sample1_mapped.bam samtools sort Sample1_mapped.bam -o Sample1_sorted > Sample1_sorted.bam samtools index Sample1_sorted.bam…

Continue Reading samtools mpileup fail to create bcf

phase_trio.sh | searchcode

/Phase/phase_trio.sh github.com/BioinformaticsArchive/fCNV Shell |…

Continue Reading phase_trio.sh | searchcode

Bcftools how to add DP to FORMAT field (get per sample read depth for REF vs ALT alleles )

I’m trying to achieve what this post was looking for: Add Dp Tag To Genotype Field Of Vcf File. Currently this is my command: bcftools mpileup -Ou --max-depth 8000 --min-MQ…

Continue Reading Bcftools how to add DP to FORMAT field (get per sample read depth for REF vs ALT alleles )

Vcfutils error code

Vcfutils error code 20-08-2021 code at line (I think) just to get it to write a proper fq. Second issue is this error: substr outside of string at /usr/local/bin/object91.ru line We can do this in a single…

Continue Reading Vcfutils error code

Calling variants on reads with MAPQ=0 on HaplotypeCaller or bcftools mpileup

I am working with about 500 samples of human exome data. I used hg19 to align my reads and ran a standard best-practices GATK workflow, only to realise later that a small 1Mb locus has not mapped properly due…

Continue Reading Calling variants on reads with MAPQ=0 on HaplotypeCaller or bcftools mpileup

EOF marker absent in VCF

EOF marker absent in VCF – can this be safely ignored? Hi, I generated a VCF file using a bcftools mpileup | bcftools call pipeline. I have done this before, and the file produced then looks fine. However, the log for this one had [W::bgzf_read_block] EOF marker is absent….

Continue Reading EOF marker absent in VCF

bcftools consensus still returns “Could not parse the header” error

I attempted to create a consensus fasta file using bcftools, i.e. bgzip -c All_SRR_SNP_Clean.vcf > All_SRR_SNP_Clean.vcf.gz tabix All_SRR_SNP_Clean.vcf.gz cat $ref| bcftools consensus $vcf_dir/All_SRR_SNP_Clean.vcf.gz > consensus.fasta where $ref is the path to a Drosophila reference genome fa and the vcf…

Continue Reading bcftools consensus still returns “Could not parse the header” error

Extremely low number of variants in VCF file after filtering MIN(FORMAT/DP)>10

I’m doing microbiome analysis where I’m looking for SNPs in a large number of microbe species’ genomes. I ran my bcftools pipeline on around 15 bacterial and viral species from which the end result produced a number of…

Continue Reading Extremely low number of variants in VCF file after filtering MIN(FORMAT/DP)>10