How to fasterq-dump 10x Genomics snATAC-seq fastq files from SRA
I am trying to retrieve fastq files from a 10x Genomics snATAC-seq dataset on SRA.
Each run should have 4 fastq files associated with it:
I1: Dual index i7 read (optional)
R1: Read 1
R2: Dual index i5 read
R3: Read 2
These four fastq files were definitely uploaded to SRA (check the data access tab for one of the runs: Link to SRA)
BAM files were not uploaded, so I can’t use bam2fastq.
I’ve downloaded the .sra files, but when I dump them with fasterq-dump (I’ve tried every option), it outputs only two files, R1 and R2, which do not necessarily correspond to the R1 and R2 listed above.
If the files have been demultiplexed, you should only need R1 and R2. However, you will need to rename the fastq files so that they match the kind of names bcl2fastq produces. 10x used to link to the naming definition, but that link now appears to be broken. Just make sure that your file names end like this:
Change the bold part to what you want (no funny characters, naturally), leave the rest exactly like that.
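For what it's worth, the renaming could be scripted along these lines. This is only a rough sketch, not an official tool: it assumes the usual bcl2fastq layout of sample name, then S<number>, L<lane>, read type and a trailing 001, and the function names, the SRR0000000 accession and the mapping from dumped files to read types are all hypothetical. You have to work out for yourself which dumped file is which read (for example by checking read lengths), and the allowed-character check is just my reading of "no funny characters".

```python
import re
from pathlib import Path

# Read types that appear in bcl2fastq-style file names.
VALID_READS = {"I1", "I2", "R1", "R2", "R3"}


def bcl2fastq_name(sample, read, lane=1, sample_index=1, gz=False):
    """Build a bcl2fastq-style file name, e.g. mysample_S1_L001_R1_001.fastq."""
    # Assumption: letters, digits, hyphens and underscores are safe in a sample name.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", sample):
        raise ValueError(f"unsupported characters in sample name: {sample!r}")
    if read not in VALID_READS:
        raise ValueError(f"unknown read type: {read!r}")
    ext = ".fastq.gz" if gz else ".fastq"
    return f"{sample}_S{sample_index}_L{lane:03d}_{read}_001{ext}"


def plan_renames(directory, sample, read_map):
    """Return (source, target) path pairs without touching the file system.

    read_map maps each dumped file name to its read type; that mapping is
    supplied by the user and is not inferred here.
    """
    directory = Path(directory)
    return [
        (directory / src,
         directory / bcl2fastq_name(sample, read, gz=src.endswith(".gz")))
        for src, read in read_map.items()
    ]


def apply_renames(pairs):
    """Rename the files on disk; run this only after checking the plan."""
    for src, dst in pairs:
        src.rename(dst)


if __name__ == "__main__":
    # Hypothetical accession and mapping, for illustration only.
    plan = plan_renames(
        ".",
        "mysample",
        {"SRR0000000_1.fastq": "R1", "SRR0000000_2.fastq": "R2"},
    )
    for src, dst in plan:
        print(src, "->", dst)
```

Printing the plan before calling apply_renames lets you check the mapping first, since a wrong read assignment will not raise an error here and will only show up later in the pipeline.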