ps. sorry if my english is not too good, it is not my native language
Don't worry, your English is excellent.
The main input that you require for GISTIC is the segmentation file, which should have:
- (1) Sample (sample name)
- (2) Chromosome (chromosome number)
- (3) Start Position (segment start position, in bases)
- (4) End Position (segment end position, in bases)
- (5) Num markers (number of markers in segment)
- (6) Seg.CN (log2(copy number) - 1)
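As a rough illustration of the six-column layout listed above, here is a minimal Python sketch that writes segments to a tab-delimited file. The function name `write_seg`, the header labels, and the choice to make the header row optional are my own assumptions for the example, so check them against the GISTIC documentation for your version.

```python
import csv
import io

# Column order follows the list above; the header labels are illustrative only.
SEG_COLUMNS = ["Sample", "Chromosome", "Start Position",
               "End Position", "Num markers", "Seg.CN"]

def write_seg(rows, handle, header=True):
    """Write segments as tab-delimited text, one segment per line.

    `rows` is an iterable of 6-item sequences in SEG_COLUMNS order.
    Whether a header row is wanted is an assumption; switch it off if not.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    if header:
        writer.writerow(SEG_COLUMNS)
    for row in rows:
        if len(row) != len(SEG_COLUMNS):
            raise ValueError(f"expected {len(SEG_COLUMNS)} fields, got {len(row)}")
        writer.writerow(row)

# Example with one made-up segment, written to an in-memory buffer.
buf = io.StringIO()
write_seg([("sample1", "1", 1000, 2000, 10, 0.0)], buf)
seg_text = buf.getvalue()
```

To write to disk instead, pass an open file handle, for example `open("sample1.seg", "w", newline="")`, where the file name is only a placeholder.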
To go directly from Control-FREEC to GISTIC 2.0, I believe the best output file to use is the ‘*_ratio.bed’ file, which can be produced by the freec2bed.pl script (see bottom of THIS page, under the section entitled ‘Translate Control-FREEC’s output into Bed or Circos formats’).
However, you will have to convert the copy number column in the BED output via:
log2(x) - 1
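The conversion above is the same as log2(x / 2), so a diploid copy number of 2 maps to 0. Here is a minimal Python sketch of it. The helper name `to_seg_cn` and the floor value of -5.0 are my own assumptions: log2 is undefined for a copy number of 0 (homozygous deletion), so some lower bound has to be chosen, and -5.0 is only a placeholder.

```python
import math

def to_seg_cn(copy_number, floor=-5.0):
    """Convert an absolute copy number x to log2(x) - 1.

    `floor` is an assumed lower bound used when the copy number is 0
    (or negative), where log2 is undefined; adjust it to taste.
    """
    if copy_number <= 0:
        return floor
    return max(math.log2(copy_number) - 1, floor)

# Diploid regions map to 0, single-copy loss to -1, four copies to +1.
diploid = to_seg_cn(2)
loss = to_seg_cn(1)
gain = to_seg_cn(4)
deleted = to_seg_cn(0)
```

Apply this to the copy number column of each BED row before writing the Seg.CN column.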
The problem will be to determine a value for the ‘Num markers’ column for the GISTIC input file. From Control-FREEC, the reads per interval are stored in the *.cpn files, I believe, and these could be used as ‘pseudo-markers’. Finding a way to overlap these with the BED file will be extra work for you, though; it could be done via complex BEDTools commands, or within R using GenomicRanges.
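If you would rather not use BEDTools or GenomicRanges, the overlap itself can be sketched in plain Python. This is only an illustration under assumptions: the function names are made up, the pseudo-marker positions are assumed to have already been parsed out of the *.cpn file into (chromosome, position) pairs (I am not assuming anything about that file's column layout here), and segment coordinates are treated as inclusive at both ends.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

def index_markers(markers):
    """Group (chrom, position) pairs by chromosome and sort the positions."""
    by_chrom = defaultdict(list)
    for chrom, pos in markers:
        by_chrom[chrom].append(pos)
    for positions in by_chrom.values():
        positions.sort()
    return by_chrom

def count_markers(segments, by_chrom):
    """Count markers inside each (chrom, start, end) segment, ends inclusive."""
    counts = []
    for chrom, start, end in segments:
        positions = by_chrom.get(chrom, [])
        # Binary search on the sorted positions gives the count without a full scan.
        counts.append(bisect_right(positions, end) - bisect_left(positions, start))
    return counts

# Made-up positions and segments, purely to show the call pattern.
markers = [("1", 100), ("1", 300), ("1", 200), ("2", 50)]
segments = [("1", 100, 250), ("1", 251, 400), ("2", 1, 40), ("3", 1, 10)]
num_markers = count_markers(segments, index_markers(markers))
```

The resulting counts line up one-to-one with the segments, so they can be used directly as the ‘Num markers’ column. Chromosome names must match exactly between the two inputs (for example "1" versus "chr1").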
In conclusion, it is not impossible to use Control-FREEC with GISTIC; however, it may be easier to use DNAcopy with the aligned BAM file and just avoid the use of Control-FREEC. It is your choice.
NB – for GISTIC versions >2.0.23, no markers file is required.