UCSC liftover

Hi, I’m using UCSC liftover to convert coordinates from hg19 to hg38.

I got a result that I don’t understand.

Feb. 2009 (GRCh37/hg19) → Dec. 2013 (GRCh38/hg38) – chr1:120904787 → chr1:143905854

Dec. 2013 (GRCh38/hg38) → Feb. 2009 (GRCh37/hg19) – chr1:143905854 → chr1:149400430

(I didn’t check “Allow multiple output regions”.)

I expected chr1:120904787 and chr1:149400430 to be the same position, since the second conversion should simply reverse the first.

I got the same result when I downloaded the chain file (hg19ToHg38.over.chain.gz) and used PyLiftover.

If you know the reason, please reply.

Thanks.


Tags: UCSC, PyLiftover, liftover

The original mapping does not appear to be correct. In the original GRCh37 region, you can see FAM72B to the left and FCGR1B to the right, with NOTCH2 further left and SRGAP2C further right. The region UCSC have mapped it to contains genes that appear to be paralogues: FAM72C and SRGAP2D.

The second mapping, back to GRCh37, seems to be correct, as this region on GRCh37 contains FAM72C.

The correct mapping of your original locus should be to 1:121118498, which has NOTCH2 and FCGR1B to the left and FAM72B and SRGAP2C to the right.

Both regions are quite difficult, as they both contain gaps in GRCh37 which have been filled in GRCh38 with some reorganisation (i.e. the two genes FAM72B and FCGR1B have flipped over). They also appear to be paralogous regions with similar genes in them.

As Emily says, this is a difficult region. It was patched in GRCh37.p8; you can read more about that here: genomeref.blogspot.com/2012/05/filling-in-gaps-to-better-understand.html. You can see the patch sequence in the following session (genome.ucsc.edu/s/Lou/fixRegion) under the ID JH636053.1.

This region contained many gaps and was actually mapped with the wrong orientation. You can see the patch sequence placed on its chromosome (genome.ucsc.edu/s/Lou/patchOrientation) and see that H3P4 is located downstream of FAM72B, the opposite of what you see in my first session.

A better approach for these troublesome regions is to search for the gene of interest directly on hg38. This leads to the correct area, around chr1:121,054,532. In this region you’ll also see H3P4 downstream of FAM72B, as this was corrected in the updated assembly: genome.ucsc.edu/s/Lou/hg38H3P4

Lifting is not a perfect science, and issues are magnified when there are assembly errors. Let us know if you encounter additional troublesome regions and we can investigate further.
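One way to spot positions like this before they cause trouble is a round-trip check: lift hg19 to hg38, lift the result back, and see whether you land where you started. Below is a minimal sketch of that idea, not an official UCSC tool. The helper name `round_trip` and the result keys are made up for illustration; the only assumption about the converter is that it behaves like PyLiftover's `convert_coordinate(chrom, pos)`, returning a list of `(chrom, pos, strand, score)` hits, an empty list when nothing maps, or `None` for an unknown chromosome.

```python
from typing import Callable, List, Optional, Tuple

# One hit in the shape PyLiftover returns: (chrom, position, strand, chain_score).
Hit = Tuple[str, int, str, int]
Converter = Callable[[str, int], Optional[List[Hit]]]


def round_trip(forward: Converter, backward: Converter, chrom: str, pos: int) -> dict:
    """Lift (chrom, pos) forward, lift the first hit back, and compare.

    `round_trip` is a hypothetical helper, not part of PyLiftover.
    """
    result = {"input": (chrom, pos), "lifted": None, "back": None, "consistent": False}

    fwd = forward(chrom, pos) or []  # None (unknown chromosome) and [] both mean "unmapped"
    if not fwd:
        return result
    result["lifted"] = (fwd[0][0], fwd[0][1])

    back = backward(*result["lifted"]) or []
    if back:
        result["back"] = (back[0][0], back[0][1])
        result["consistent"] = result["back"] == (chrom, pos)
    return result


# Usage with PyLiftover (left commented out: it needs the package installed,
# and the chain files are fetched from UCSC on first use):
#
#   from pyliftover import LiftOver
#   to_hg38 = LiftOver("hg19", "hg38")
#   to_hg19 = LiftOver("hg38", "hg19")
#   # PyLiftover positions are 0-based; the browser shows 1-based coordinates.
#   print(round_trip(to_hg38.convert_coordinate, to_hg19.convert_coordinate,
#                    "chr1", 120904787 - 1))
```

A position where `consistent` comes back `False` is not necessarily wrong, but it is worth checking by hand, for example by searching for the gene directly on hg38 as described above.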

If you have any follow-up questions, our help desk can always be reached at genome@soe.ucsc.edu. You may also send questions to genome-www@soe.ucsc.edu if they contain sensitive data. For any Genome Browser questions on Biostars, the UCSC tag is the best way to ensure visibility by the team.

