Gender determination and chrX CN calls


I’m running CNVkit in amplicon mode on a set of tumor BAM files generated with a small amplicon panel of 45 genes. The panel includes just one gene on chrX and none on chrY. My reference was built from 10 normal male samples sequenced with the same panel.

Initially I ran the pipeline using -y at all appropriate steps. The reference samples are correctly assumed male, but almost all of my tumor samples are also treated as male. I then ran the same pipeline but without the -y. The reference samples are still assumed male, but now I have a far more even breakdown of male and female in the sample set.

I prefer the output without the -y.

My understanding is that without -y the chrX coverage of the male reference samples is treated as doubled, i.e. their log2 values are shifted up by 1. I confirmed that the diploid chrX log2 ratios are +1 with respect to the haploid chrX ratios. Since only one chrX region and no chrY region is covered, it seems that automatic determination of sample gender will be confounded whenever there is a CN change in that X region.
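To make the shift concrete, here is a minimal sketch of the arithmetic only; it is not CNVkit's own code. It assumes a copy-neutral sample with 1 chrX copy (male) or 2 (female), and that the reference chrX counts as haploid with -y and diploid without it. The function name is made up for illustration.

```python
import math


def expected_chrx_log2(sample_sex, male_reference):
    """Expected log2 ratio of a copy-neutral chrX region (hypothetical helper).

    sample_sex: "male" or "female".
    male_reference: True if chrX in the reference is treated as haploid
    (the -y case), False if it is treated as diploid.
    """
    if sample_sex not in ("male", "female"):
        raise ValueError("sample_sex must be 'male' or 'female'")
    sample_copies = 1 if sample_sex == "male" else 2
    ref_copies = 1 if male_reference else 2
    return math.log2(sample_copies / ref_copies)


if __name__ == "__main__":
    # Print the four combinations of reference handling and sample sex.
    for male_reference in (True, False):
        for sex in ("male", "female"):
            value = expected_chrx_log2(sex, male_reference)
            print(f"male_reference={male_reference} sample={sex} log2={value:+.1f}")
```

Under these assumptions, a male sample with a one-copy gain of the chrX gene and a copy-neutral female sample both have 2 copies, so both sit at the same log2 value. With a single chrX region, the sex guess and the CN call cannot be separated from that region alone.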

Are there any special considerations I should bear in mind with only one gene on chrX and no genes on chrY sequenced?

How, if at all, would incorrect gender identification for a sample affect CN calls on chrX?






You are totally right. I have run into the same problem and still have not been able to solve it.
Did you find a solution?

In my case the reference samples are generally mixed (both male and female) and I do not use the -y option. In the calling step I normally use thresholds like these:

  • for cn = 0 (loss) -> -1
  • for cn = 1 (no alteration) -> 0.5849625
  • for cn = 2 (gain) -> 1.321928
  • for cn = 3 (gain) -> 1.807355

but these thresholds do not work for the sex chromosomes. I couldn’t find any default thresholds for sex chromosomes in the documentation.
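For what it's worth, the four values in the list equal log2(k + 0.5) for k = 0..3, i.e. the midpoints between integer copy numbers measured against a single-copy reference. Below is a minimal sketch of that arithmetic. Rescaling the same midpoint rule by the reference copy number is my own assumption, not something taken from the CNVkit documentation, and the function name is hypothetical.

```python
import math


def midpoint_thresholds(ref_copies, max_cn=3):
    """Upper log2-ratio bound for each copy number 0..max_cn (hypothetical helper).

    The bound for copy number k is placed halfway between k and k + 1
    copies, expressed relative to ref_copies copies in the reference.
    """
    if ref_copies <= 0:
        raise ValueError("ref_copies must be positive")
    return [math.log2((k + 0.5) / ref_copies) for k in range(max_cn + 1)]


if __name__ == "__main__":
    # Single-copy reference (e.g. chrX treated as haploid).
    print([round(t, 4) for t in midpoint_thresholds(1)])
    # Two-copy reference (autosomes, or chrX treated as diploid).
    print([round(t, 4) for t in midpoint_thresholds(2)])
```

Under this rule the values in the list correspond to ref_copies=1, and a two-copy reference gives the same values shifted down by exactly 1. Which set is appropriate on chrX would depend on how the reference and the sample sex were handled, which is the open question here.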

