MATLAB cross-validation in PCA in bioinformatics data analysis – Freelance Job in Quantitative Analysis – Less than 30 hrs/week

I need help in MATLAB with a bioinformatics dataset produced by 3 alignment methods: 1) Clustal, 2) Muscle, and 3) Mafft.
Each method has its own data matrix of size N x M, where N = number of observations and M = number of variables.
Clustal = 32 X 149535
Muscle = 32 X 149580
Mafft = 32 X 149580
My main task is to compare these 3 methods using a PCA model. I have already applied PCA in MATLAB using score_pca and the PCA loadings routine from the MEDA toolbox, which is built on MATLAB. The link is github.com/josecamachop/MEDA-Toolbox
I built the model with only 2 principal components (2 PCs). That choice was not based on a scientific approach such as cross-validation; it was only meant to test that our approach works and to make visualization easier.

Now I need to compute the prediction error (PRESS) based on the column-wise k-fold (ckf) algorithm. ckf is implemented in the MEDA toolbox, and I currently run it manually using the following steps:
• First, preprocess the data
• Calculate the scores
• Calculate the loadings
• Then compute the PRESS prediction error using ckf.
I can show these steps later during our discussion.
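As a rough illustration of the steps above, here is a minimal sketch in base MATLAB. It does not call MEDA; the cross-validation loop is my own reading of the column-wise idea (each variable is predicted from the remaining ones through the PCA model), so the toolbox's ckf may differ in detail. The matrix X is a random placeholder, and maxPcs and the choice of mean-centering are assumptions.

```matlab
% Sketch only: PRESS curve from a column-wise cross-validation of PCA.
% X is a random placeholder, NOT the real alignment data.
X = randn(32, 200);          % placeholder: N = 32 observations, M = 200 variables
maxPcs = 10;                 % assumed upper limit on the components to test

% Step 1: preprocess (mean-centering here; auto-scaling would also divide by std)
Xc = X - mean(X, 1);

% Steps 2-3: scores T and loadings P from an economy-size SVD
[U, S, V] = svd(Xc, 'econ');
P = V(:, 1:maxPcs);                           % loadings, M x maxPcs
T = U(:, 1:maxPcs) * S(1:maxPcs, 1:maxPcs);   % scores, N x maxPcs

% Step 4: cumulative PRESS for 0, 1, ..., maxPcs components
cumpress = zeros(maxPcs + 1, 1);
for a = 0:maxPcs
    Pa = P(:, 1:a);
    E = Xc - T(:, 1:a) * Pa';        % ordinary PCA residuals with a components
    q = sum(Pa .^ 2, 2)';            % 1 x M, squared loading norm per variable
    Ecv = E + Xc .* q;               % error when each column is left out of its own prediction
    cumpress(a + 1) = sum(Ecv(:) .^ 2);
end
```

Here cumpress(1) corresponds to 0 components and equals the total sum of squares of the centered data. Unlike the plain residual sum of squares, this curve can rise again once extra components stop helping, which is what makes it usable for choosing the number of PCs.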
So I need these manual steps automated to produce, for example, a table with 3 values for each method, and I also need to automate the function that identifies the number of components using a threshold. Suppose we then get 8 PCs for the 1st method, 6 PCs for the 2nd method and 5 PCs for the 3rd method. I need to reach a decision: "How can I combine these different numbers of PCs into one single number of PCs for all three methods?" (I do not know which approach can be used here.) For instance, we could say that we select 4 PCs for all three methods for some reason (what would that reason be?). I don't know how this can be calculated manually and then automatically.
However, the earlier steps (calculating the CV PRESS prediction error and identifying the number of PCs) can already be done manually; I just need them to be calculated automatically.
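To make the threshold step concrete, here is a minimal sketch in base MATLAB. The three PRESS curves are made-up numbers, the 5% threshold is an assumption, and the stopping rule (keep adding PCs while the relative drop in PRESS exceeds the threshold) is only one possible convention, not necessarily the one MEDA uses.

```matlab
% Sketch only: choose the number of PCs per method from a PRESS curve.
% The curves are hypothetical; entry 1 is the PRESS for 0 components.
names  = {'Clustal', 'Muscle', 'Mafft'};
curves = {[100 60 40 30 26 25 24.8 24.7 24.9]', ...
          [100 70 50 45 44 43.5]', ...
          [100 50 47 46 45.8]'};
thr = 0.05;                  % assumed threshold: 5% relative drop in PRESS

nPcs = zeros(1, numel(curves));
for m = 1:numel(curves)
    c = curves{m};
    relDrop = -diff(c) ./ c(1:end-1);   % relative improvement from each added PC
    k = find(relDrop < thr, 1);         % first PC that no longer helps enough
    if isempty(k)
        nPcs(m) = numel(relDrop);       % every tested PC passed the threshold
    else
        nPcs(m) = k - 1;
    end
    fprintf('%-8s %d PCs\n', names{m}, nPcs(m));
end

% Simple candidates for a single common value (which one is right is the open question)
commonMin = min(nPcs);
commonMedian = median(nPcs);
commonMax = max(nPcs);
```

With these made-up curves the rule selects 4, 3 and 2 PCs. Taking the minimum would be the conservative option, since every retained component then passes the threshold in all three methods, but whether that is the appropriate criterion for this comparison is exactly what still has to be decided.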
I'm considering a fixed price of $60 for this task over 3-4 days, plus $30 for any small additions or modifications. I just need to automate the equations built on MEDA (which I have) and to define the recommended number of PCs for the methods. Any further details will be clarified in a meeting before starting.
Additionally, in the following week I will request another task related to Procrustes PCA, for another $60 plus $30 for any edits.
