Feature selection

I am starting out with bulk ATAC-seq data as BED files that include read counts. I want to use this data with a package called MOFA, whose documentation lists these preprocessing steps:

  1. Normalisation: For count-based data such as RNA-seq or ATAC-seq we recommend size factor normalisation + variance stabilisation (i.e. a log transformation).

  2. Feature selection:
    It is strongly recommended that you select highly variable features (HVGs) per assay before fitting the model. This ensures faster training and a more robust inference procedure. Also, for data modalities that have very different dimensionalities we suggest a stronger feature selection for the bigger views, with the aim of reducing the feature imbalance between data modalities.

I am finding a lot of information on how to do this with single-cell data in R, but not bulk data. Are there any tutorials for how to do these steps with bulk data?
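
For reference, here is a minimal sketch of what the two quoted steps might look like on a bulk peaks-by-samples count matrix. It is not MOFA's own procedure, and it makes several assumptions: the BED files have already been merged into one shared peak set with a count per sample, size factors are estimated with the median-of-ratios method (one common choice; the MOFA text does not prescribe one), variance stabilisation is a plain log2 with a pseudocount, and "highly variable" means the largest variance across samples after the transformation. The peak names, the toy counts, and `n_top` are hypothetical. It is written in plain Python for illustration only; in R, DESeq2's `estimateSizeFactors` and `vst` cover similar ground.

```python
import math
from statistics import median, variance


def size_factors(counts):
    """Median-of-ratios size factors, one per sample.

    counts: list of rows (peaks), each a list of per-sample read counts.
    """
    n_samples = len(counts[0])
    # Only peaks with a non-zero count in every sample have a defined log.
    log_rows = [[math.log(c) for c in row]
                for row in counts if all(c > 0 for c in row)]
    if not log_rows:
        raise ValueError("no peak has non-zero counts in every sample")
    # Log of the geometric mean of each peak across samples.
    log_geo_means = [sum(row) / n_samples for row in log_rows]
    factors = []
    for j in range(n_samples):
        log_ratios = [row[j] - g for row, g in zip(log_rows, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def log_normalise(counts, factors, pseudocount=1.0):
    """Divide each sample by its size factor, then take log2(x + pseudocount)."""
    return [[math.log2(c / f + pseudocount) for c, f in zip(row, factors)]
            for row in counts]


def top_variable(log_matrix, peak_ids, n_top):
    """Return the n_top peak ids with the largest variance across samples."""
    ranked = sorted(zip(peak_ids, log_matrix),
                    key=lambda pair: variance(pair[1]),
                    reverse=True)
    return [peak_id for peak_id, _ in ranked[:n_top]]


# Toy data (hypothetical): 5 peaks x 3 samples; samples differ in depth.
peak_ids = ["peak_a", "peak_b", "peak_c", "peak_d", "peak_e"]
counts = [
    [10, 20, 40],
    [100, 200, 400],
    [50, 100, 200],
    [80, 20, 40],
    [5, 10, 80],
]

factors = size_factors(counts)
log_matrix = log_normalise(counts, factors)
selected = top_variable(log_matrix, peak_ids, n_top=2)
print(factors)
print(selected)
```

The rows of `log_matrix` that correspond to `selected` would then be the ATAC view passed on to the model. How many peaks to keep is a judgment call; following the quoted advice, a view with far more features than the others would get a stricter cut-off.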




