Generating variant read count matrix, total read count matrix and binary/ternary mutation matrix for SNVs from scDNAseq FASTQ files


In Leung et al. (2017), the data processing shown in Fig. 1 indicates that cells from the CRC patients were sequenced as single cells for both SNV analysis (with MDA WGA) and CNA analysis (with DOP-PCR) in parallel. However, the SRA accession, in particular for patient CRC2 (CO8), contains 240 FASTQ files with two types of Library Selection:

  • PCR Library Selection having 198 FASTQ files
  • Random PCR Library Selection having 42 FASTQ files

I don’t understand which single cells were sequenced for SNV analysis and which for CNA analysis.
I want to reproduce the following for the SNV mutation matrix from the scDNAseq FASTQ files in the SRA accession:

  • variant read count matrix,
  • total read count matrix and
  • binary/ternary mutation matrix

Once the matrices are generated, the goal is to reconstruct the tumor phylogeny.
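To make the target concrete, here is a minimal sketch of how I imagine the binary/ternary matrix could be derived once the variant and total read count matrices (cells x sites) exist. The function name `mutation_matrices`, the thresholds `min_depth`, `min_alt` and `hom_vaf`, and the encoding (0 = absent, 1 = present/heterozygous, 2 = homozygous, -1 = missing) are my own assumptions for illustration, not values from the paper.

```python
import numpy as np

def mutation_matrices(variant_reads, total_reads,
                      min_depth=5, min_alt=2, hom_vaf=0.9):
    """Derive binary and ternary matrices from read counts (cells x sites).

    All thresholds are illustrative assumptions, not published values.
    Encoding: 0 = absent, 1 = present (heterozygous in the ternary matrix),
    2 = homozygous (ternary only), -1 = missing (insufficient coverage).
    """
    variant_reads = np.asarray(variant_reads)
    total_reads = np.asarray(total_reads)

    # A site is callable in a cell only if it has enough total coverage.
    covered = total_reads >= min_depth

    # Variant allele frequency; left at 0 where there is no coverage.
    vaf = np.divide(variant_reads, total_reads,
                    out=np.zeros(total_reads.shape, dtype=float),
                    where=total_reads > 0)

    # Call a mutation when the site is covered and has enough variant reads.
    mutated = covered & (variant_reads >= min_alt)

    binary = np.where(covered, mutated.astype(int), -1)
    ternary = np.where(mutated & (vaf >= hom_vaf), 2, binary)
    return binary, ternary

# Toy example: 2 cells x 3 sites.
variant = [[0, 6, 10],
           [1, 0, 0]]
total = [[10, 12, 10],
         [3, 8, 0]]
binary, ternary = mutation_matrices(variant, total)
print(binary)
print(ternary)
```

Simple thresholding like this ignores allelic dropout and amplification errors, which is part of why I am asking about single-cell-specific callers below.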

NB: Follow-up question: I have tried downloading the FASTQ files, aligning each one to the human reference genome with BWA/Bowtie2, and converting each alignment to a BAM file using samtools.
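For reference, the per-cell alignment step looks roughly like the sketch below. It only builds the command strings and does not run them. The helper name `alignment_commands`, the file names, the reference path `ref.fa`, the thread count and the paired-end layout are all assumptions for illustration.

```python
import shlex

def alignment_commands(sample, fastq1, fastq2, reference="ref.fa", threads=4):
    """Return shell command strings for aligning one cell's paired FASTQ files.

    Assumes paired-end reads and a reference already indexed with `bwa index`.
    """
    bam = f"{sample}.sorted.bam"
    # Align with BWA-MEM and pipe straight into a coordinate sort.
    align = (
        f"bwa mem -t {threads} {shlex.quote(reference)} "
        f"{shlex.quote(fastq1)} {shlex.quote(fastq2)} "
        f"| samtools sort -o {shlex.quote(bam)} -"
    )
    # Index the sorted BAM so downstream tools can access it by region.
    index = f"samtools index {shlex.quote(bam)}"
    return [align, index]

for cmd in alignment_commands("cell1", "cell1_1.fastq.gz", "cell1_2.fastq.gz"):
    print(cmd)
```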

Now, for variant calling, should I use a single-cell-specific variant caller, or a caller generally used for bulk DNA sequencing data, such as Mutect2, VarScan, GATK, etc.?


SNV


scDNAseq


mutationMatrix


tumorPhylogney


mutaion
