Building a HISAT2 index: unmasked or masked genome?
Hello everyone,
I have a couple of questions about which source and which version of the genome I should use to build an index for aligning with HISAT2.
The first is whether it is correct to build the index from a “toplevel” file from Ensembl Plants (ftp.ebi.ac.uk/ensemblgenomes/pub/release-55/plants/fasta/zea_mays/dna/), or whether I should use the one from NCBI (www.ncbi.nlm.nih.gov/genome/?term=zea+mays).
If Ensembl Plants is the right choice, which of these files would be the best one to use?
- Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.toplevel.fa.gz 615M
- Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna_rm.toplevel.fa.gz 123M
- Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna_sm.toplevel.fa.gz 641M
Description:
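‘dna’ – unmasked genomic DNA.
‘dna_rm’ – hard-masked genomic DNA (repeats and low-complexity regions replaced by Ns).
‘dna_sm’ – soft-masked genomic DNA (repeats and low-complexity regions in lowercase).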
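To see how the three files differ in practice, one option is to count the base classes in each FASTA: Ensembl's hard-masked files replace repeats with N, while soft-masked files keep the sequence but write repeats in lowercase. The following is only a minimal sketch; the function names `masking_profile` and `profile_fasta_gz` are hypothetical helpers, not part of HISAT2 or Ensembl, and the character-by-character loop would be slow on a full maize genome.

```python
import gzip
from collections import Counter


def masking_profile(lines):
    """Count upper-case bases, lower-case bases, and N/n in FASTA lines."""
    counts = Counter()
    for line in lines:
        if line.startswith(">"):
            continue  # skip FASTA header lines
        for ch in line.strip():
            if ch in "Nn":
                counts["n"] += 1      # hard-masked or unknown base
            elif ch.islower():
                counts["lower"] += 1  # soft-masked base
            else:
                counts["upper"] += 1  # unmasked base
    return counts


def profile_fasta_gz(path):
    """Hypothetical convenience wrapper for a gzipped FASTA file."""
    with gzip.open(path, "rt") as handle:
        return masking_profile(handle)


# Tiny in-memory example (made-up sequence, not real maize data).
example = [">chr_demo", "ACGTacgtNNNN", "acgtACGT"]
print(dict(masking_profile(example)))
```

A file that is mostly `n` in repeat regions would be the hard-masked one, a file with a large `lower` count the soft-masked one, and a file with neither the unmasked one.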
Could you help me clear this up, please?