Why does the chr1.fa FASTA file have a bunch of Ns, and why is some of the DNA in lower case while the rest is in upper case?
I have a couple of questions about the chr1.fa FASTA file at the link below:
Q1) Why does the file begin with a long run of N characters? The IUPAC code for DNA sequences says that N means any nucleotide base, so does this mean that the sequencing equipment could not determine the bases at the beginning of Chromosome 1? Also, starting at line 3550, and again at line 76,907, there are roughly a hundred more lines of Ns.
Q2) Why are parts of the DNA in lower case, while other parts are in upper case?
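For anyone who wants to see exactly where these regions fall, here is a minimal Python sketch that lists the runs of N and the lower-case stretches in a sequence. It assumes a single-record FASTA small enough to hold in memory, and the helper names `read_fasta_sequence` and `find_runs` are made up for illustration. As background, and assuming the file is a UCSC-style assembly download: in those files runs of N conventionally mark gaps in the assembly rather than individual failed base calls, and lower case marks soft-masked repeats. Whether that applies here depends on where the file came from.

```python
import re
from typing import Iterable, List, Tuple


def read_fasta_sequence(lines: Iterable[str]) -> str:
    """Join the sequence lines of a single-record FASTA, skipping '>' headers."""
    return "".join(line.strip() for line in lines if not line.startswith(">"))


def find_runs(sequence: str, pattern: str) -> List[Tuple[int, int]]:
    """Return 0-based, half-open (start, end) coordinates of every match."""
    return [(m.start(), m.end()) for m in re.finditer(pattern, sequence)]


# Tiny made-up record standing in for chr1.fa (hypothetical data).
example = [">chr_demo\n", "NNNNNNacgt\n", "ACGTNNNNac\n", "GTACGTACGT\n"]
seq = read_fasta_sequence(example)

n_runs = find_runs(seq, r"[Nn]+")        # stretches of unknown bases
lower_runs = find_runs(seq, r"[acgt]+")  # lower-case stretches

print("N runs:", n_runs)
print("lower-case runs:", lower_runs)
```

To run it on the real file, pass an open file handle instead of the list, for example `with open("chr1.fa") as fh: seq = read_fasta_sequence(fh)`. The coordinates are positions in the sequence, not line numbers in the file.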
Link to the Chromosome 1 file: