poor classification using qiime2 – User Support

Good morning,

I am having difficulty getting good results, even though my pipeline has not changed.
Specifically, the classification is poor: half of the sequences are assigned only to Bacteria or OD1, and the number of OTUs is also very low (e.g. 900). So I think this is not a great result.

I include my commands

taxa_classi:
	$(CONDA_ACTIVATE) Miqiime2-2021.8; \
	qiime feature-classifier classify-sklearn \
	--i-classifier gg-13-8-99-nb-classifier.qza \
	--i-reads rep-seqs-or-85.qza \
	--o-classification taxonomy10C.qza

joined_import_filter_derep:
	export HDF5_USE_FILE_LOCKING='FALSE'; \
	$(CONDA_ACTIVATE) Miqiime2-2021.8; \
	qiime vsearch dereplicate-sequences \
	--i-sequences fil_joined.qza \
	--o-dereplicated-table table.qza \
	--o-dereplicated-sequences rep-seqs.qza

Could you please help me?

Thanks a lot

I should mention that I have not yet checked whether the sequencing primers have changed.

Michela

I would very much appreciate your kind help.




Hello Michela,

Information about the primers is crucial. A change in primers is the most likely explanation for the classifier's poor performance.
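
If the raw FASTQ files are still at hand, a rough check like the one below may help show whether the reads still begin with the primer you expect. This is only a sketch: `count_primer_hits` is a hypothetical helper, the file name in the usage line is a placeholder, and the 515F pattern is just an example (your facility may use a different primer, or may already have trimmed primers off). Degenerate IUPAC bases have to be written as character classes, e.g. M becomes [AC].

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of QIIME 2): count how many of the first
# N reads in a FASTQ file begin with a given primer pattern.
count_primer_hits() {
    local fastq="$1"             # plain or gzip-compressed FASTQ file
    local primer_re="$2"         # primer as an extended regex, e.g. 'GTGCCAGC[AC]GCCGCGGTAA'
    local n_reads="${3:-10000}"  # how many reads to inspect

    # gzip -cdf decompresses gzip input and passes plain text through unchanged.
    # In FASTQ, every 4th line starting from line 2 is a sequence line.
    # grep -c exits non-zero on zero matches, hence the trailing "|| true".
    gzip -cdf "$fastq" \
        | awk 'NR % 4 == 2' \
        | head -n "$n_reads" \
        | grep -c -E "^${primer_re}" || true
}

# Example usage (placeholder file name):
#   count_primer_hits sample_R1.fastq.gz 'GTGCCAGC[AC]GCCGCGGTAA'
```

A count close to the number of reads inspected suggests the primer is still present at the start of the reads; a count near zero suggests a different primer, trimmed primers, or reads in the opposite orientation.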

Cheers
Valentyn

Hi Valentyn, I will be back with information about the primers. I would certainly need advice on which classifier would be the best fit.

Thanks a lot for your support

Michela

After you obtain primer sequences you can refer to a tutorial on building a reference database here:

Cheers
Valentyn



Thanks a lot!
Could you please remind me which primers are compatible with this classifier database:
gg-13-8-99-nb-classifier.qza

This would help me a lot in working out why classification has suddenly stopped working on data from the same facility.

I would appreciate it very much

Michela

More details are available on the data resources page.

Naive Bayes classifiers trained on:

gg-13-8-99-nb-classifier.qza is the first one. No primers were used to select a region, so the full-length 16S sequences are used for k-mer profiling and classification.

Using RESCRIPt to build a database for just your region of interest should perform better because it’s more specific.
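
As a rough illustration of what a region-specific classifier involves, here is a minimal sketch. Note that it uses the q2-feature-classifier commands (`extract-reads` and `fit-classifier-naive-bayes`) rather than RESCRIPt itself. Everything specific in it is an assumption: the reference file names are placeholders, and the 515F/806R primer pair is only an example that should be replaced with the primers actually used by the facility. The `run` wrapper and `DRY_RUN` variable are hypothetical conveniences so the commands can be printed and reviewed before anything is executed.

```shell
#!/usr/bin/env bash
# Sketch: train a Naive Bayes classifier on one amplicon region only.
# All file names and the primer pair below are placeholders.
REF_SEQS="ref-seqs.qza"          # imported reference sequences (assumed name)
REF_TAX="ref-taxonomy.qza"       # matching reference taxonomy (assumed name)
F_PRIMER="GTGCCAGCMGCCGCGGTAA"   # 515F, example only
R_PRIMER="GGACTACHVGGGTWTCTAAT"  # 806R, example only
DRY_RUN="${DRY_RUN:-1}"          # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

train_region_classifier() {
    # 1. Cut the reference sequences down to the amplified region.
    run qiime feature-classifier extract-reads \
        --i-sequences "$REF_SEQS" \
        --p-f-primer "$F_PRIMER" \
        --p-r-primer "$R_PRIMER" \
        --o-reads ref-seqs-region.qza

    # 2. Train the classifier on the extracted region.
    run qiime feature-classifier fit-classifier-naive-bayes \
        --i-reference-reads ref-seqs-region.qza \
        --i-reference-taxonomy "$REF_TAX" \
        --o-classifier region-nb-classifier.qza

    # 3. Classify the representative sequences with the new classifier.
    run qiime feature-classifier classify-sklearn \
        --i-classifier region-nb-classifier.qza \
        --i-reads rep-seqs-or-85.qza \
        --o-classification taxonomy-region.qza
}

train_region_classifier
```

Run as-is, the script only prints the three commands; setting `DRY_RUN=0` inside an activated QIIME 2 environment would execute them. The classifier should be trained in the same QIIME 2 release that is later used for classification, since trained classifiers are tied to the scikit-learn version.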
