Maximum Read Depth

Maximum Read Depth

3

Hi,

I wish to get the variants of my sequencing data using Samtools (vcfutils). For this I need to specify the maximum read depth in VarFilter function. I have no information of this from my Data. Can anybody tell me, how should I approach to its calculations, or considering default values to obtain SNPs in vcf format??

Thanking you


vcf


read


samtools

• 4.8k views

updated 2 hours ago by

▴

20

written 11.3 years ago by

▴

60

generally, maximal coverage for variant calling is set out of memory considerations, for this purpose i think the practice recommended in the GATK Unified Genotyper should be appropriate:

“…When running on projects with many samples at low coverage (e.g. 1000 Genomes with 4x coverage per sample) we usually lower this [the maximal coverage] value to about 10 times the average coverage (40x)…”

good luck

I have previously considered this. The ‘-D’ flag allows the program to avoid regions that have abnormally high coverage, e.g. PCR amplification errors. I calculated the mean/median coverage for the sample using BEDTools, which was also covered a previous Biostars question.

genomeCoverageBed -d -ibam file_sorted.bam -g genome_seq.fasta > file_sorted.1bp_coverage_inc0

This method does includes 0 counts for regions not covered by reads.

You can put the output in R or i used the Perl script at the bottom of the above link by Heikki.

To set ‘-D’ i used a value 3 to 5 x that of the mean coverage, you’ll to play with this to get the best result for your samples. I saw very little difference in the number of SNPs using any multiplier >3.

Bear in mind SNPs will not be reported for those regions with coverage > ‘-D’. Setting the value too high will result in too many false positives.

Also it is hightly recommended to locally realign reads to avoid false positive SNPs due to the presence of INDELs, e.g. SRMA or use of the GATK suite.

updated 3.6 years ago by

38k

written 11.3 years ago by

5.9k

Why don’t you just use it without setting this parameter. There is a default parameter value set to 1000.


Login
before adding your answer.

Traffic: 3643 users visited in the last hour

Read more here: Source link