Cell Ranger count pipeline: using FASTQ files with identical names as input


Hello, I have a question about passing several FASTQ files to the ‘cellranger count’ pipeline.

I performed scRNA-seq on different samples at a partner institute. The sequencing facility first sequenced all the samples at a lower depth (to test the quality of the libraries) and only then performed a second sequencing run at higher depth. As a result, I ended up with two sets of files for each sample, which they assured me could be merged during data analysis (so that the final sequencing depth would be the sum of the depths obtained in the two runs). The issue is that the files in these two sets have exactly the same names (e.g. ‘sample12_S11_L002_R1_001.fastq.gz’), and I don’t know whether I can give them all as input to ‘cellranger count’ or whether the software will be confused by the duplicated file names. I am also not sure whether I can simply rename the fastq.gz files to make them unique and solve the issue that way.
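
To make the setup concrete, here is a minimal sketch of the option I am considering: keep each sequencing run in its own directory, so the identical file names never collide, and pass both directories to a single run. As far as I understand the documentation, ‘--fastqs’ accepts a comma-separated list of paths and ‘--sample’ takes the file name prefix. The directory names, the run ID, the reference path and the helper function are hypothetical, and the command is only built here, not executed. Depending on the Cell Ranger version, additional arguments may be required.

```python
def build_count_command(run_id, sample, run_dirs, transcriptome):
    """Build (but do not run) an argument list for `cellranger count`.

    run_dirs holds one directory per sequencing run; each directory may
    contain files with identical names, since they never share a folder.
    """
    # Assumption: --fastqs takes several directories as a comma-separated list.
    fastq_arg = ",".join(run_dirs)
    return [
        "cellranger", "count",
        f"--id={run_id}",
        f"--transcriptome={transcriptome}",
        f"--fastqs={fastq_arg}",
        f"--sample={sample}",
    ]


# Hypothetical layout: one folder per sequencing run of the same library.
cmd = build_count_command(
    run_id="sample12_merged",
    sample="sample12",
    run_dirs=["/data/run1_shallow", "/data/run2_deep"],
    transcriptome="/refs/my_transcriptome",
)
print(" ".join(cmd))
```

The resulting list could be handed to ‘subprocess.run’ once the paths are replaced with real ones.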

Has anyone run into this issue, or does anyone know how the pipeline handles it?

Additionally, would it be better to run the two sets of files separately through ‘cellranger count’ and then analyse them together in Seurat? I’m not very keen on using ‘cellranger aggr’ because its normalization is not the same as the one performed by Seurat, and I would prefer to process (filter, normalize…) all the count matrices in Seurat.

