Tag: VDB
Most sensible way to find private SNPs from a multisamples vcf with bcftools
Hello, this question is somewhat complementary to what I asked yesterday here: Using bcftools to find unique alt homozygous sites. Now let’s say I want to find the SNPs that are 0/1 only in the sample D3A350g_bcftools2 (see below). I know I can use bcftools view -s D3A350g_bcftools2.bcf -x all_bcftools2_merged.vcf But there…
Using bcftools to find unique alt homozygous sites
Hello, I have a VCF with 20 samples. For each sample, I want to find the sites that are 1/1 only in that sample (so the other samples must have genotypes 0/1 or 0/0). I know I can use filters such as 'GT="aa"'. However, how do I say GT="aa" for sample…
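The criterion described in this excerpt (1/1 in one sample, 0/0 or 0/1 in every other sample) can be sketched in plain Python over VCF text. This is only an illustration of the logic, not a bcftools recipe: the function name `private_hom_alt` and the toy sample names A, B, C are hypothetical, GT is assumed to be the first FORMAT key, phased genotypes are treated like unphased ones, and a missing genotype (`./.`) in any other sample is assumed to disqualify the site.

```python
def private_hom_alt(vcf_lines, sample):
    """Yield (chrom, pos) for sites where `sample` is 1/1 and all other samples are 0/0 or 0/1."""
    allowed_others = {("0", "0"), ("0", "1")}
    target = None
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            target = fields[9:].index(sample)  # ValueError if the sample is absent
            continue
        if target is None:
            raise ValueError("no #CHROM header line seen before the first record")
        # Assumption: GT is the first FORMAT key; '|' (phased) is treated like '/'.
        gts = [tuple(sorted(col.split(":")[0].replace("|", "/").split("/")))
               for col in fields[9:]]
        others = gts[:target] + gts[target + 1:]
        if gts[target] == ("1", "1") and all(g in allowed_others for g in others):
            yield fields[0], int(fields[1])


# Toy three-sample VCF (hypothetical data, for illustration only).
toy = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tA\tB\tC",
    "chr1\t10\t.\tA\tG\t.\t.\t.\tGT\t1/1\t0/1\t0/0",
    "chr1\t20\t.\tC\tT\t.\t.\t.\tGT\t1/1\t1/1\t0/0",
    "chr1\t30\t.\tG\tA\t.\t.\t.\tGT:DP\t0/1:5\t1|1:7\t0/0:3",
]
print(list(private_hom_alt(toy, "A")))
```

Looping the same call over every sample name would give one list of private 1/1 sites per sample.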
ncbi error report log for validate fastq issue
I’m trying to fetch a list of GSM IDs that I confirmed are present in the project folder using the SRA Explorer tool, but when I try to download them through a script it fails even after…
What We Know So Far
At the Google I/O developer conference in May 2023, CEO Sundar Pichai announced the company’s upcoming artificial intelligence (AI) system, Gemini. The large language model (LLM) is being developed by the Google DeepMind division (Brain Team + DeepMind). It could compete with AI systems like ChatGPT from OpenAI and possibly…
Modifying the vdb-config to increase timeout in a docker image
This is the Docker image I want to modify. My objective is to include the following setting in the container, in other words to increase the timeout: # set timeout to 10 seconds $ vdb-config -s /http/timeout/read=10000 The steps I have taken so far are: - docker pull biomystery/sra-tools-pigz:2.10.9 - docker run -it -d --name kcm_sra_tool…
How to create custom docker image to download bam files from SRA
I am trying to create a Docker image from a Dockerfile and an entrypoint script. However, it only generates <none>:<none>. This is my Dockerfile and entrypoint.sh. Would someone please let me know what’s missing here? FROM openjdk:8-jre LABEL…
How to extract read counts at the mutation locations
I have a scDNAseq dataset with multiple FASTQ files for multiple single cells. The FASTQ files were aligned to the hg19 reference with BWA, and samtools was then used to produce BAM files. I have already identified 36 SNV mutation sites and I want to use mpileup to extract read counts (Total read count…
Enabling accurate and early detection of recently emerged SARS-CoV-2 variants of concern in wastewater
Wastewater sample collection, RNA extraction, and sequencing: Houston Water collected and provided weekly 24-hour time-weighted composite influent (raw wastewater) samples from 39 wastewater treatment plants (WWTPs) in Houston covering a service area of approximately 580 square miles and serving over 2.3 million people. In total, 2637 samples were analyzed. Untreated wastewater…
Removing indels in VCF file
Hello, I am trying to do something very simple, but running into confusing behaviour. I have a VCF file of multiple samples and want to remove all indels so that I can generate sequences with identical coordinates with bcftools consensus. I removed indels by specifying bcftools view --include 'TYPE="snp" ||…
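The goal stated in this excerpt (keeping only records that cannot shift coordinates) can be sketched in plain Python. This is a hedged illustration and not a reconstruction of the truncated bcftools expression: the helper names `is_snp_record` and `drop_indels` are hypothetical, and the rule used here is an assumption, namely that a record is kept only when REF and every ALT allele are single bases, so records with a `*` (spanning deletion) or `.` ALT are dropped as well.

```python
BASES = set("ACGTN")


def is_snp_record(line):
    """True if REF and every ALT allele of a VCF data line are single bases."""
    fields = line.rstrip("\n").split("\t")
    ref, alts = fields[3], fields[4].split(",")
    alleles = [ref] + alts
    return all(len(a) == 1 and a.upper() in BASES for a in alleles)


def drop_indels(vcf_lines):
    """Yield header lines unchanged and only the data lines that pass is_snp_record."""
    for line in vcf_lines:
        if line.startswith("#") or is_snp_record(line):
            yield line


# Toy records (hypothetical data): SNP, deletion, insertion, multiallelic SNP,
# mixed SNP/insertion, spanning deletion.
records = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t10\t.\tA\tG\t.\t.\t.",
    "chr1\t20\t.\tAT\tA\t.\t.\t.",
    "chr1\t30\t.\tA\tAT\t.\t.\t.",
    "chr1\t40\t.\tA\tG,T\t.\t.\t.",
    "chr1\t50\t.\tA\tG,AT\t.\t.\t.",
    "chr1\t60\t.\tA\t*\t.\t.\t.",
]
kept_positions = [l.split("\t")[1] for l in drop_indels(records) if not l.startswith("#")]
print(kept_positions)
```

Dropping mixed records entirely is a deliberately strict choice here; splitting multiallelic sites first would keep the SNP allele of such records.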
bcftools get allele abundance
I’m using bcftools to extract variants from a bam file, but I have reference data that tells me whether the patient is homozygous or heterozygous. For a particular sample, I see a high proportion of the alternate allele (87%) and a lower proportion of the reference allele (13%), yet according…
Prefetch-orig and fasterq-dump not working when downloading SRA files (v3.0.5)
Hi everyone! I am having some problems when I try to download SRA files. Yesterday I was trying to download a set of SRA files using SRA Toolkit 3.0.2 and it didn’t work. I thought that it was a problem with the version, so I just installed the new one…
What is the major problem with this pipeline of SNPs analysis?
First, I have raw Illumina sequencing data for several Aspergillus flavus (a fungal species) samples as pairs of fastq.gz files (sample1_filtered_1.fastq.gz and sample1_filtered_2.fastq.gz). I wanted to assemble the Illumina reads and perform SNP (single nucleotide polymorphism) analysis against the reference genome, Aspergillus flavus NRRL3357, provided as a FASTA file. At the end…
Variant caller reports a homozygous variant genotype, but more reads are associated with reference
Hi there, I’m confused about how to interpret this output from calling variants using bcftools: #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT GSM5292852 1065632 chr9 41242177 . T C 6.65947 . DP=37;VDB=0.133454;SGB=-0.662043;RPBZ=2.91136;MQBZ=4.0715;BQBZ=1.05041;SCBZ=-0.480069;MQ0F=0;AC=2;AN=2;DP4=0,26,0,9;MQ=5 GT:PL:AD…
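The read counts behind the question can be read off the INFO/DP4 tag, which bcftools defines as the number of high-quality ref-forward, ref-reverse, alt-forward, and alt-reverse bases. The sketch below only does that arithmetic for a subset of the INFO string quoted in the excerpt. The helper names `parse_info` and `dp4_summary` are hypothetical, and the code does not attempt to explain why the caller chose the genotype.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-style keys map to an empty string."""
    out = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        out[key] = value
    return out


def dp4_summary(info):
    """Return ref/alt base counts and the alt fraction from INFO/DP4."""
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in parse_info(info)["DP4"].split(","))
    ref, alt = ref_fwd + ref_rev, alt_fwd + alt_rev
    total = ref + alt
    return {"ref": ref, "alt": alt, "alt_fraction": alt / total if total else None}


# Subset of the INFO field quoted in the excerpt above.
info = "DP=37;VDB=0.133454;AC=2;AN=2;DP4=0,26,0,9;MQ=5"
summary = dp4_summary(info)
print(summary["ref"], summary["alt"], round(summary["alt_fraction"], 3))
```

For this record the counts are 26 reference and 9 alternate bases, all on the reverse strand, which matches the observation in the title that more reads support the reference.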
No genotype likelihoods when doing SNP calling using bcftools
Hello everyone, I am trying to get genotype likelihoods using bcftools. I am using bcftools version 1.11, running bcftools mpileup and bcftools call. This is what I run: bcftools mpileup -d 8000 -Ou -f $reference $input | bcftools call -mv -Ob -o $variants However, when I check the columns INFO…
fastq-dump split-spot and skip-technical
Hi all, can anyone give a clear explanation of the --split-spot and --skip-technical options in fastq-dump? --split-spot: Split spots into individual reads. (I guess this option splits each read into two parts.) --skip-technical: Dump only biological reads. What can --split-spot be used for? What is the principle behind --skip-technical? What’s the difference between…
zero byte files in sratoolkit.3.0.1-ubuntu64
I have downloaded sratoolkit for Ubuntu. After getting the tar file and extracting it, I added the bin path to the PATH variable as well. When I run which fastq-dump, it correctly identifies the path. However, when I run vdb-config -i or vdb-config --interactive…
sra-toolkit not working
I downloaded sra-toolkit from the SRA website for 64-bit Windows. After extracting it, I tested it from the bin folder: sra-toolkit\bin> fastq-dump --stdout -X 2 SRR390728 but it throws an error: 2015-12-15T09:54:40 fastq-dump.2.5.5 err: item not found while constructing within virtual database module - the path 'SRR390728' cannot be opened…
links to Ensembl GRCh37 – gitmetadata
Open Targets Genetics reports GRCh38 coordinates but the “External references” section points to GRCh37 (grch37.ensembl.org) rather than GRCh38 (www.ensembl.org): genetics.opentargets.org/variant/8_102432699_T_C Was this a deliberate decision (e.g. we don’t have the rsID in GRCh38 for some reason, other)? If so, we need to make this clear in the docs. If not, we…
How to call variant by --max-depth for RNAseq
Hi everyone! I have a query regarding variant calling at a high-coverage site on the basis of the maximum likelihood variant. I have a BAM file of mapped RNA-seq data. I called variants using the command below: "bcftools mpileup --max-depth 10000 -Oz -f ref.fa sample.bam | bcftools call -mv -Oz -o…
bcftools merge GP format issues
Hello, I am trying to merge VCF files from several samples from different sequencing runs. I ran bcftools merge on the VCF files and after ten hours I got the error message “Incorrect number of FORMAT/GP values at chr_Y:216795, cannot merge. The tag is defined as Number=G, but found 2…
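The count that `Number=G` implies can be worked out directly: the VCF specification defines it as one value per possible genotype, which for ploidy P and N alleles (reference included) is the number of multisets of size P drawn from N alleles. The sketch below shows only that arithmetic. The function name `expected_g_values` is hypothetical, and whether ploidy explains the count of 2 in the truncated error message cannot be determined from the excerpt.

```python
from math import comb


def expected_g_values(n_alleles, ploidy=2):
    """Number of values a Number=G field should carry: C(n_alleles + ploidy - 1, ploidy)."""
    if n_alleles < 1 or ploidy < 1:
        raise ValueError("n_alleles and ploidy must be positive")
    return comb(n_alleles + ploidy - 1, ploidy)


# Biallelic site (REF + 1 ALT) at different ploidies, then a triallelic diploid site.
print(expected_g_values(2, ploidy=2))  # diploid: genotypes 0/0, 0/1, 1/1
print(expected_g_values(2, ploidy=1))  # haploid: genotypes 0, 1
print(expected_g_values(3, ploidy=2))  # diploid with two ALT alleles
```

A diploid biallelic record therefore needs three GP values, while a haploid call at the same site carries two, so comparing the ploidy of the genotype calls across the input files is one thing worth checking.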
converting SAM file to SRA
Does anybody know how to convert SAM files to the equivalent SRA format? I downloaded an SRA file and used sam-dump to convert it to SAM format. Thereafter, I used bam-load to convert the SAM file to SRA. The reference fasta and configuration files are specified…
Downloading authorized data from dbGAP leads to a “looped” message
Hello everyone, I am trying to download authorized dbGAP data. My PI gave me the .ngc key and the necessary kart file. I used this as a guide to configure vdb-config: github.com/ncbi/sra-tools/wiki/05.-Toolkit-Configuration I then tried to use prefetch to…
SRA Toolkit Error while using prefetch
After installing SRA Toolkit I downloaded one SRA file, but when I tried to download another SRA file it gave me the following error: prefetch: symbol lookup error: /lib/x86_64-linux-gnu/libncbi-vdb.so.2: undefined symbol: vdb_mbedtls_md_setup How can I fix this?
blastn_vdb Error: Cannot open VDB:
I’m working locally with sra-toolkit. I downloaded an SRA file and validated it with vdb-validate, but when I try to run a search with blastn_vdb the following error occurs: Error: Cannot get VDB column: ./SRR7443983.sra.SEQUENCE.ACC_PREFIX Does anyone know how to fix it? Thanks
Bcftools how to add DP to FORMAT field (get per sample read depth for REF vs ALT alleles )
I’m trying to achieve what this post was looking for: Add Dp Tag To Genotype Field Of Vcf File. Currently this is my command: bcftools mpileup -Ou --max-depth 8000 --min-MQ…
VcfSampleCompare – empty output with warnings
Hello to all, I have 10 VCF files (5 female fish and 5 male fish), which I have merged into one VCF file (all_fish.vcf). I ran the vcfSampleCompare analysis on 'all_fish.vcf', following this: github.com/hepcat72/vcfSampleCompare The example command: vcfSampleCompare.pl --sample-group 'wt1 wt2 wt3' --sample-group 'mut1 mut2 mut3' input.vcf…