Tag: VDB

Most sensible way to find private SNPs from a multisamples vcf with bcftools

Hello, this question is somewhat complementary to what I asked yesterday here: Using bcftools to find unique alt homozygous sites. Now let’s say I want to find the 0/1 SNPs unique to the sample D3A350g_bcftools2 (see below). I know I can use bcftools view -s D3A350g_bcftools2.bcf -x all_bcftools2_merged.vcf But there…

Using bcftools to find unique alt homozygous sites

Hello, I have a VCF with 20 samples. I want to find, for each sample, the sites that are 1/1 only in that sample (so other samples must have genotypes 0/1 or 0/0). I know I can use filters such as GT="aa". However, how do I say GT="aa" for sample…
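
The selection logic in this question can be sketched in plain Python. This only illustrates the idea and is not a bcftools recipe. The helper name `private_hom_alt` and the three-sample toy VCF are hypothetical, and the sketch assumes biallelic sites with GT as the first FORMAT field. A site counts as private when exactly one sample is 1/1; other samples are not checked for missing genotypes.

```python
def private_hom_alt(vcf_text):
    """Map each sample name to the (CHROM, POS) sites where it alone is 1/1."""
    samples = []
    private = {}
    for line in vcf_text.splitlines():
        if line.startswith("##") or not line.strip():
            continue  # skip meta-information and blank lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns start at index 9
            private = {name: [] for name in samples}
            continue
        # Assumption: GT is the first colon-separated FORMAT field.
        gts = [col.split(":")[0].replace("|", "/") for col in fields[9:]]
        hom_alt = [i for i, gt in enumerate(gts) if gt == "1/1"]
        if len(hom_alt) == 1:
            private[samples[hom_alt[0]]].append((fields[0], int(fields[1])))
    return private


# Hypothetical toy data: three samples, three sites.
EXAMPLE_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "chr1\t100\t.\tA\tG\t50\t.\t.\tGT\t1/1\t0/1\t0/0",
    "chr1\t200\t.\tC\tT\t50\t.\t.\tGT\t1/1\t1/1\t0/0",
    "chr1\t300\t.\tG\tA\t50\t.\t.\tGT\t0/0\t0/1\t1/1",
])

result = private_hom_alt(EXAMPLE_VCF)
print(result)
```

In the toy data, the site at position 200 is dropped because two samples are 1/1 there, so it is private to neither.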

ncbi error report log for validate fastq issue

I’m trying to fetch a list of GSM IDs that I confirmed are present in the project folder using the SRA Explorer tool, but when I try to download them through a script it fails even after…

What We Know So Far

At the Google I/O developer conference in May 2023, CEO Sundar Pichai announced the company’s upcoming artificial intelligence (AI) system, Gemini. The large language model (LLM) is being developed by the Google DeepMind division (Brain Team + DeepMind). It could compete with AI systems like ChatGPT from OpenAI and possibly…

Modifying the vdb-config to increase timeout in a docker image

This is the Docker image I want to modify. My objective is to include the following in the container, in other words, to increase the timeout: # set timeout to 10 seconds $ vdb-config -s /http/timeout/read=10000 The steps I have taken so far are: – docker pull biomystery/sra-tools-pigz:2.10.9 – docker run -it -d --name kcm_sra_tool…

How to create custom docker image to download bam files from SRA

I am trying to create a Docker image from a Dockerfile and an entrypoint script. However, it only generates <none>:<none>. This is my Dockerfile and entrypoint.sh. Would someone please let me know what’s missing here? FROM openjdk:8-jre LABEL…

How to extract read counts at the mutation locations

I have an scDNA-seq dataset with multiple FASTQ files for multiple single cells. The FASTQ files were aligned to the hg19 reference with BWA, and samtools was then used to produce BAM files. I have already identified 36 SNV mutation sites and I want to use mpileup to extract read counts (Total read count…

Enabling accurate and early detection of recently emerged SARS-CoV-2 variants of concern in wastewater

Wastewater sample collection, RNA extraction, and sequencing: Houston Water collected and provided weekly 24-hour time-weighted composite influent (raw wastewater) samples from 39 wastewater treatment plants (WWTPs) in Houston covering a service area of approximately 580 square miles and serving over 2.3 million people. In total, 2637 samples were analyzed. Untreated wastewater…

Removing indels in VCF file

Hello, I am trying to do something very simple, but running into confusing behaviour. I have a VCF file of multiple samples and want to remove all indels so that I can generate sequences with identical coordinates with bcftools consensus. I removed indels by specifying bcftools view --include 'TYPE="snp" ||…

bcftools get allele abundance

I’m using bcftools to extract variants from a bam file, but I have reference data that tells me whether the patient is homozygous or heterozygous. For a particular sample, I see a high proportion of the alternate allele (87%) and a lower proportion of the reference allele (13%), yet according…

Prefetch-orig and fasterq-dump not working when downloading SRA files (v3.0.5)

Hi everyone! I am having some problems when I try to download SRA files. Yesterday I was trying to download a set of SRA files using SRA Toolkit 3.0.2 and it didn’t work. I thought that it was a problem with the version, so I just installed the new one…

What is the major problem with this pipeline of SNPs analysis?

First, I have raw Illumina sequencing data for several Aspergillus flavus (a fungal species) samples as pairs of fastq.gz files (sample1_filtered_1.fastq.gz and sample1_filtered_2.fastq.gz). I want to assemble the Illumina reads and run a SNP (single nucleotide polymorphism) analysis against the reference genome, Aspergillus flavus NRRL3357, provided as a FASTA file. At the end…

Variant caller reports a homozygous variant genotype, but more reads are associated with reference

Hi there, I’m confused about how to interpret this output from calling variants using bcftools: #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT GSM5292852 1065632 chr9 41242177 . T C 6.65947 . DP=37;VDB=0.133454;SGB=-0.662043;RPBZ=2.91136;MQBZ=4.0715;BQBZ=1.05041;SCBZ=-0.480069;MQ0F=0;AC=2;AN=2;DP4=0,26,0,9;MQ=5 GT:PL:AD…
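
The read support behind this record can be read off the DP4 tag, which bcftools defines as the counts of high-quality ref-forward, ref-reverse, alt-forward and alt-reverse bases. Below is a minimal Python sketch that sums those counts for the INFO string quoted in the excerpt. The function name `dp4_allele_fractions` is hypothetical, and the sketch assumes a single ALT allele.

```python
def dp4_allele_fractions(info):
    """Return (ref_reads, alt_reads, alt_fraction) from a VCF INFO string."""
    # Split "KEY=VALUE;KEY=VALUE" into a dict, ignoring flag-only entries.
    tags = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in tags["DP4"].split(","))
    ref_reads = ref_fwd + ref_rev
    alt_reads = alt_fwd + alt_rev
    total = ref_reads + alt_reads
    alt_fraction = alt_reads / total if total else 0.0
    return ref_reads, alt_reads, alt_fraction


# INFO string copied from the record in the excerpt.
INFO = (
    "DP=37;VDB=0.133454;SGB=-0.662043;RPBZ=2.91136;MQBZ=4.0715;"
    "BQBZ=1.05041;SCBZ=-0.480069;MQ0F=0;AC=2;AN=2;DP4=0,26,0,9;MQ=5"
)

ref_reads, alt_reads, alt_fraction = dp4_allele_fractions(INFO)
print(ref_reads, alt_reads, round(alt_fraction, 3))  # 26 9 0.257
```

For this record the sketch reports 26 reference reads and 9 alternate reads, all on the reverse strand, which matches the mismatch the poster describes.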

No genotype likelihoods when doing SNP calling using bcftools

Hello everyone, I am trying to get genotype likelihoods using bcftools. I am using bcftools version 1.11, running bcftools mpileup and bcftools call. This is what I run: bcftools mpileup -d 8000 -Ou -f $reference $input | bcftools call -mv -Ob -o $variants However, when I check the columns INFO…

fastq-dump split-spot and skip-technical

Hi all, can anyone give a clear explanation of the split-spot and skip-technical options in fastq-dump? --split-spot Split spots into individual reads. (**I guess this option splits each read into two parts**) --skip-technical Dump only biological reads. What can --split-spot be used for? What is the principle behind --skip-technical? What’s the difference between…

zero byte files in sratoolkit.3.0.1-ubuntu64

I have downloaded sratoolkit for Ubuntu. After getting the tar file and unpacking it, I added the bin path to the PATH variable as well. When I run which fastq-dump, it correctly identifies the path. However, when I run vdb-config -i or vdb-config --interactive…

sra-toolkit not working

I downloaded sra-toolkit from the SRA website for 64-bit Windows. After extracting it, I tested it from the bin folder: sra-toolkit\bin> fastq-dump --stdout -X 2 SRR390728 but it throws an error: 2015-12-15T09:54:40 fastq-dump.2.5.5 err: item not found while constructing within virtual database module - the path 'SRR390728' cannot be opened…

links to Ensembl GRCh37 – gitmetadata

Open Targets Genetics reports GRCh38 coordinates but the “External references” section points to GRCh37 (grch37.ensembl.org) rather than GRCh38 (www.ensembl.org): genetics.opentargets.org/variant/8_102432699_T_C Was this a deliberate decision (e.g. we don’t have the rsID in GRCh38 for some reason, other)? If so, we need to make this clear in the docs. If not, we…

How to call variants with --max-depth for RNA-seq

Hi everyone! I have a query regarding variant calling at a high-coverage site on the basis of the maximum likelihood variant. I have a BAM file of mapped RNA-seq data. I called variants using the command below. "bcftools mpileup --max-depth 10000 -Oz -f ref.fa sample.bam | bcftools call -mv -Oz -o…

bcftools merge GP format issues

Hello, I am trying to merge VCF files from several samples from different sequencing runs. I ran bcftools merge on the VCF files and after ten hours I got the error message “Incorrect number of FORMAT/GP values at chr_Y:216795, cannot merge. The tag is defined as Number=G, but found 2…

converting SAM file to SRA

Does anybody know how to convert SAM files to the equivalent SRA format? I downloaded an SRA file and used sam-dump to convert it to SAM format. Thereafter, I used bam-load to convert the SAM file to SRA. The reference fasta and configuration files are specified…

Downloading authorized data from dbGAP leads to a “looped” message

Hello everyone, I am trying to download authorized dbGAP data. My PI gave me the .ngc key and the necessary kart file. I used this as a guide to configure vdb-config: github.com/ncbi/sra-tools/wiki/05.-Toolkit-Configuration I then tried to use prefetch to…

SRA Toolkit Error while using prefetch

After installing SRA Toolkit I downloaded an SRA file, and when I tried to download another SRA file it gave me the following error: prefetch: symbol lookup error: /lib/x86_64-linux-gnu/libncbi-vdb.so.2: undefined symbol: vdb_mbedtls_md_setup How to fix this? …

blastn_vdb Error: Cannot open VDB:

I’m working locally with sra-toolkit. I have downloaded an SRA file, validated it with vdb-validate, and when I try to run a search with blastn_vdb the following error occurs: Error: Cannot get VDB column: ./SRR7443983.sra.SEQUENCE.ACC_PREFIX Does anyone know how to fix it? Thanks …

Bcftools: how to add DP to FORMAT field (get per-sample read depth for REF vs ALT alleles)

I’m trying to achieve what this post was looking for: Add Dp Tag To Genotype Field Of Vcf File. Currently this is my command: bcftools mpileup -Ou --max-depth 8000 --min-MQ…

VcfSampleCompare – empty output with warnings

Hello to all, I have 10 VCF files (5 female fish and 5 male fish), and I have merged all 10 fish into one VCF file (all_fish.vcf). I performed the VcfSampleCompare analysis on 'all_fish.vcf', following this: github.com/hepcat72/vcfSampleCompare The example command: vcfSampleCompare.pl --sample-group 'wt1 wt2 wt3' --sample-group 'mut1 mut2 mut3' input.vcf…
