How to fasterq-dump 10x genomics snATACseq fastq from SRA

I am trying to retrieve fastq files from a 10x genomics snATACseq dataset on SRA.
Each run should have 4 fastq files associated with it:

I1: Dual index i7 read (optional)
R1: Read 1
R2: Dual index i5 read
R3: Read 2

These four fastq files were definitely uploaded to SRA (check the data access tab for one of the runs: Link to SRA)
BAM files were not uploaded, so I can’t use bam2fastq.

I’ve downloaded the .sra files, but when I dump them with fasterq-dump (I’ve tried every option), I get only two files, R1 and R2, which do not necessarily correspond to the R1 and R2 listed above.

How do I get all four fastq files from SRA?

With

$ fastq-dump --split-files -F SRR11858618

I get the 4 expected files. You should be able to figure out how to rename them to R1, R2, R3, and I1 after the extraction is complete.

$ more SRR11858618_*
::::::::::::::
SRR11858618_1.fastq
::::::::::::::
@A00325:101:HF727DRXX:1:1101:1208:1016
ACGGGACT
+A00325:101:HF727DRXX:1:1101:1208:1016
FFFFFFFF
@A00325:101:HF727DRXX:1:1101:2003:1016
ACGGGACT
+A00325:101:HF727DRXX:1:1101:2003:1016
FFFFFFFF
::::::::::::::
SRR11858618_2.fastq
::::::::::::::
@A00325:101:HF727DRXX:1:1101:1208:1016
TNTAAGATCAATGTTCTAAAAAAGTGACAAAACCTCAGTGTTTCTTTCCT
+A00325:101:HF727DRXX:1:1101:1208:1016
F#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFF
@A00325:101:HF727DRXX:1:1101:2003:1016
ANTAGGAACAGTCCTTCCAACACAGATTAGGTTCATTGGGAACACATGCA
+A00325:101:HF727DRXX:1:1101:2003:1016
F#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
::::::::::::::
SRR11858618_3.fastq
::::::::::::::
@A00325:101:HF727DRXX:1:1101:1208:1016
NCTATTGTCTTAGTGG
+A00325:101:HF727DRXX:1:1101:1208:1016
#FFFFFFFF:FFFFFF
@A00325:101:HF727DRXX:1:1101:2003:1016
NAGCGCTGTTGCAGAG
+A00325:101:HF727DRXX:1:1101:2003:1016
#FFFFFFFFFFFFFFF
::::::::::::::
SRR11858618_4.fastq
::::::::::::::
@A00325:101:HF727DRXX:1:1101:1208:1016
GTGCAGGTCAGGCTCCGGTAAGGAATGCGTGAAACTCAGTTTCTAAAGG
+A00325:101:HF727DRXX:1:1101:1208:1016
FFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFF:FFFFF
@A00325:101:HF727DRXX:1:1101:2003:1016
GTCCGTCTGTCCCAGAAGTCCCAGCTCCTTTCCTGCTCTGGCACCTCCT
+A00325:101:HF727DRXX:1:1101:2003:1016
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
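
The renaming step could be sketched roughly as below. The mapping is an assumption inferred only from the read lengths in the listing above: _1 (8 bp) looks like the I1 sample index, _2 (50 bp) like R1, _3 (16 bp) like the R2 barcode read, and _4 (49 bp) like R3. Check the lengths for your own run before relying on it. The function name rename_10x_atac, the sample name argument, and the fixed S1/L001 fields are hypothetical choices for illustration.

```shell
#!/usr/bin/env bash
# Rename fastq-dump --split-files output to bcl2fastq-style names.
# ASSUMPTION: _1.._4 correspond to I1, R1, R2, R3 (inferred from read
# lengths 8/50/16/49 bp in the listing above); verify this per run.
rename_10x_atac() {
    local run="$1" sample="$2"
    local -a reads=(I1 R1 R2 R3)
    local i

    # Refuse to rename anything unless all four inputs are present.
    for i in 1 2 3 4; do
        if [ ! -f "${run}_${i}.fastq" ]; then
            echo "missing ${run}_${i}.fastq" >&2
            return 1
        fi
    done

    # S1 and L001 are placeholder sample-number and lane fields.
    for i in 1 2 3 4; do
        mv -- "${run}_${i}.fastq" "${sample}_S1_L001_${reads[i-1]}_001.fastq"
    done
}
```

For example, `rename_10x_atac SRR11858618 mysample` run in the directory holding the four files would produce mysample_S1_L001_I1_001.fastq and so on; compress them with gzip afterwards if you want .fastq.gz names.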

If the files have been demultiplexed, you should only need R1 and R2. However, you will need to rename the fastq files so that they look like the names bcl2fastq produces. 10X used to have a link to the naming definition, but it looks like it’s broken. Just make sure that your file names end like this:

pbmc_1k_v3_S1_L001_R1_001.fastq.gz

pbmc_1k_v3_S1_L001_R2_001.fastq.gz

Change the sample-name prefix (pbmc_1k_v3 here, shown in bold in the original post) to what you want (no funny characters, naturally), and leave the rest exactly like that.
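
A quick way to check names before running the pipeline might look like the sketch below. The regular expression is an assumption generalized from the two example names above (sample name, then _S<number>_L<3 digits>_<read>_001.fastq, optionally gzipped); it is not an official 10x definition, and the function name is_bcl2fastq_name is hypothetical.

```shell
#!/usr/bin/env bash
# Return 0 if a file name follows the bcl2fastq-style pattern shown above.
# ASSUMPTION: the pattern is inferred from the example names only.
is_bcl2fastq_name() {
    local name
    name=$(basename -- "$1")
    # sample name, sample number, lane, read type (R1-R3 or I1-I2), fixed 001
    local re='^[A-Za-z0-9_-]+_S[0-9]+_L[0-9]{3}_(R[1-3]|I[12])_001\.fastq(\.gz)?$'
    [[ "$name" =~ $re ]]
}
```

For example, `is_bcl2fastq_name pbmc_1k_v3_S1_L001_R1_001.fastq.gz` succeeds, while `is_bcl2fastq_name SRR11858618_1.fastq` fails.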

