How to obtain a segmentation file from Control-FREEC output to use with GISTIC

P.S. Sorry if my English is not too good; it is not my native language. XD

Don't worry, your English is excellent.

——————

The main input that you require for GISTIC is the segmentation file, which should have:

  • (1) Sample (sample name)
  • (2) Chromosome (chromosome number)
  • (3) Start Position (segment start position, in bases)
  • (4) End Position (segment end position, in bases)
  • (5) Num markers (number of markers in segment)
  • (6) Seg.CN (log2(copy number) - 1)

To go directly from Control-FREEC to GISTIC 2.0, I believe the best output file to use is the ‘*_ratio.bed’ file, which can be produced by the freec2bed.pl script (see the bottom of THIS page, under the section entitled ‘Translate Control-FREEC’s output into Bed or Circos formats’).

However, you will have to convert the copy number column in the BED output, where x is the copy number of the segment, via:

log2(x) - 1
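
As a rough illustration of that conversion, here is a minimal Python sketch. It assumes the BED file has four whitespace-separated columns (chromosome, start, end, copy number); please check this against your own freec2bed.pl output, because I have not verified the exact layout. The function names, the `floor` value used to avoid log2(0) for homozygous deletions, and the placeholder `num_markers` value are all my own choices for illustration, not part of Control-FREEC or GISTIC.

```python
import math


def copy_number_to_seg_cn(copy_number, floor=0.01):
    """Return log2(x) - 1, so a diploid copy number of 2 maps to 0.

    A copy number of 0 has no finite log2, so values are clamped to
    `floor` first (an arbitrary choice made for this sketch).
    """
    return math.log2(max(copy_number, floor)) - 1.0


def bed_to_seg_rows(bed_lines, sample, num_markers=0):
    """Turn assumed 4-column BED lines into 6-column GISTIC-style rows.

    `num_markers` is only a placeholder here; it is not derived from data.
    """
    rows = []
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and any track/browser/comment header lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, cn = line.split()[:4]
        # Drop a leading "chr" so the chromosome is reported as a number/letter.
        if chrom.startswith("chr"):
            chrom = chrom[3:]
        seg_cn = round(copy_number_to_seg_cn(float(cn)), 4)
        rows.append((sample, chrom, int(start), int(end), num_markers, seg_cn))
    return rows


# Example with made-up segments for a hypothetical sample name.
example = ["chr1\t0\t50000\t2", "chr1\t50000\t120000\t4"]
for row in bed_to_seg_rows(example, "tumour_01"):
    print("\t".join(str(field) for field in row))
```

Each printed line is one tab-separated segment row in the column order listed above.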

The problem will be determining a value for the ‘Num markers’ column in the GISTIC input file. In Control-FREEC, the reads per interval are stored in the *.cpn files, I believe, and these intervals could be used as ‘pseudo-markers’. Finding a way to overlap these with the BED file will be extra work for you, though; it could be done via complex BEDTools commands, or within R using GenomicRanges.
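
If you would rather stay outside BEDTools and R, the overlap itself is simple to sketch in Python. The example below assumes you have already parsed the interval start positions out of the *.cpn file into (chromosome, start) pairs; I am deliberately not showing the parsing, because I am not certain of the exact column layout of that file. It then counts how many interval starts fall inside each segment and treats that count as the pseudo-marker number. All function names here are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def index_window_starts(windows):
    """Group interval start positions by chromosome and sort them."""
    by_chrom = defaultdict(list)
    for chrom, start in windows:
        by_chrom[chrom].append(start)
    for starts in by_chrom.values():
        starts.sort()
    return by_chrom


def count_markers(segments, by_chrom):
    """Count interval starts inside each (chrom, start, end) segment.

    Both segment boundaries are treated as inclusive, so an interval
    starting exactly on a shared boundary is counted in both
    neighbouring segments; adjust if that matters for your data.
    """
    counts = []
    for chrom, seg_start, seg_end in segments:
        starts = by_chrom.get(chrom, [])
        lo = bisect_left(starts, seg_start)   # first start >= seg_start
        hi = bisect_right(starts, seg_end)    # first start > seg_end
        counts.append(hi - lo)
    return counts


# Made-up example: three intervals on chromosome 1, one on chromosome 2.
windows = [("1", 0), ("1", 1000), ("1", 2000), ("2", 0)]
segments = [("1", 0, 1500), ("1", 1500, 3000), ("2", 0, 500)]
print(count_markers(segments, index_window_starts(windows)))
```

The binary search keeps this fast even with many intervals per chromosome, but for whole-genome data GenomicRanges or BEDTools will probably still be more convenient.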

To conclude, it is not impossible to use Control-FREEC with GISTIC; however, it may be easier to use DNAcopy with the aligned BAM file and avoid Control-FREEC altogether. It is your choice.

See you soon

Kevin

NB – for GISTIC versions >2.0.23, no markers file is required.