How to demultiplex a single-end, dual-indexed run

Hi all,

I have data from a single-end run with a dual-index structure, generated on a NextSeq 500 instrument. I want to demultiplex it with bcl2fastq. What should my SampleSheet.csv and bcl2fastq command look like?

I used the SampleSheet structure shown below with the bcl2fastq command shown below, and about 18% of reads ended up undetermined, which I think is an acceptable ratio (roughly 14 million of about 76 million reads in total).

The SampleSheet.csv structure is shown below. The index column holds the i7 index and the index2 column holds the i5 index.

[Header],,,
[Reads],,,
[Settings],,,
adapter,,,
,,,
[Data],,,
Sample_ID,Sample_Name,Description,index,index2
1_mESCs,1_mESCs,,AACCGCGG,CTAGCGCT
2_mESCs,2_mESCs,,GGTTATAA,CTAGCGCT
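
Since I run with --barcode-mismatches 0, I also wanted to see how far apart the two index pairs are. This is a minimal Python sketch, not part of bcl2fastq: it parses only the [Data] rows copied from the sheet above, treats i7 + i5 as one combined barcode (my own assumption about how to compare them), and counts the differing positions. The helper name hamming is mine.

```python
import csv
import io

# The [Data] section of the sample sheet above, copied verbatim.
DATA_SECTION = """\
Sample_ID,Sample_Name,Description,index,index2
1_mESCs,1_mESCs,,AACCGCGG,CTAGCGCT
2_mESCs,2_mESCs,,GGTTATAA,CTAGCGCT
"""


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))


rows = list(csv.DictReader(io.StringIO(DATA_SECTION)))

# Assumption: compare samples on the concatenated i7 + i5 sequence.
barcodes = {r["Sample_ID"]: r["index"] + r["index2"] for r in rows}

# Pairwise distances between all samples in the sheet.
ids = list(barcodes)
distances = {}
for i, a in enumerate(ids):
    for b in ids[i + 1:]:
        distances[(a, b)] = hamming(barcodes[a], barcodes[b])
        print(a, b, distances[(a, b)])
```

For these two samples the i5 index is identical, so all of the difference comes from the i7 index.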

The bcl2fastq command is as follows.

bcl2fastq --runfolder-dir --output-dir --sample-sheet  --barcode-mismatches 0

Is the overall approach correct? Should I include the --use-bases-mask option as well?
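
For context, this is how I understand the mask string would be written for a run like mine. It is only a sketch under assumptions: one sequence read followed by two 8-cycle index reads (8 matches the index lengths in my sample sheet, but I have not confirmed the read order or cycle counts against RunInfo.xml). The function bases_mask is a hypothetical helper of mine, not part of bcl2fastq. As far as I understand, bcl2fastq works the mask out itself from RunInfo.xml and the sample sheet when the option is omitted.

```python
def bases_mask(reads):
    """Build a --use-bases-mask string from (cycles, is_index) pairs.

    cycles=None stands for "all cycles of this read" and is written as '*'.
    """
    parts = []
    for cycles, is_index in reads:
        letter = "I" if is_index else "Y"  # I = index read, Y = sequence read
        parts.append(letter + ("*" if cycles is None else str(cycles)))
    return ",".join(parts)


# Assumed layout: one sequence read, then the 8-cycle i7 and i5 index reads.
mask = bases_mask([(None, False), (8, True), (8, True)])
print(mask)  # Y*,I8,I8
```

The resulting string would be passed as the value of --use-bases-mask.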

Thank you for your input.
