Why does the chr1.fa FASTA file have a bunch of Ns, and why is some of the DNA in lower case while the rest is in upper case?

Hi,

I have a couple of questions about the chr1.fa FASTA file at the link below:

Q1) Why does the file begin with a long run of N characters? In the IUPAC code for DNA sequences, N means any nucleotide base, so does this mean the sequencer could not determine the bases at the beginning of Chromosome 1? Also, starting at line 3550 or line 76,907, there are roughly a hundred more lines of Ns.

Q2) Why are parts of the DNA in lower case, while other parts are in upper case?

Link to the Chromosome 1 file:
hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/?C=S;O=A
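
To put numbers on what the two questions describe, here is a minimal Python sketch that reports where the runs of N fall and how many bases are in lower case versus upper case. It assumes a decompressed, single-record FASTA file; the function name `summarize_fasta` and the tiny demo record are made up for illustration, and it reads the whole sequence into memory, which is fine for a sketch but heavy for a full chromosome.

```python
import re
from typing import Dict, Iterable, List, Tuple


def summarize_fasta(lines: Iterable[str]) -> Dict[str, object]:
    """Tally N runs and upper/lower-case bases in a single-record FASTA.

    Hypothetical helper for illustration; it loads the full sequence
    into memory, so a large chromosome needs a few hundred MB of RAM.
    """
    # Concatenate sequence lines, skipping the ">" header and blank lines.
    seq = "".join(
        line.strip()
        for line in lines
        if line.strip() and not line.startswith(">")
    )

    # 0-based, half-open (start, end) coordinates of each run of N.
    n_runs: List[Tuple[int, int]] = [
        (m.start(), m.end()) for m in re.finditer(r"[Nn]+", seq)
    ]
    n_total = sum(end - start for start, end in n_runs)

    # Count lower-case and upper-case A/C/G/T separately.
    lower = sum(seq.count(c) for c in "acgt")
    upper = sum(seq.count(c) for c in "ACGT")

    return {
        "length": len(seq),
        "n_runs": n_runs,
        "n_total": n_total,
        "lower": lower,
        "upper": upper,
    }


if __name__ == "__main__":
    # Made-up demo record, not real chromosome data.
    demo = [">chrDemo", "NNNNNNNN", "NNACGTac", "gtACNNNN", "ACGT"]
    print(summarize_fasta(demo))
```

For the real file, something like `with open("chr1.fa") as fh: summary = summarize_fasta(fh)` should work once the download has been unzipped (the local path is an assumption).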

