if (length(which(installed_packages == "GPArotation"))==0){

install.packages("GPArotation")

}

if (length(which(installed_packages == "VennDiagram"))==0){

install.packages("VennDiagram")

}

rm(installed_packages)

require(ropls)

require(Matrix)

require(pracma)

require(GPArotation)

require(stats)

require(VennDiagram)

If some packages are not yet installed in your R environment, they are downloaded automatically. The required packages (GPArotation, Matrix, pracma, ropls, stats, VennDiagram) are then loaded.
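To see which of the required packages are missing before anything is installed, a check along the following lines can be used. This is only a sketch: the helper name `missing_packages` is hypothetical and is not part of PocheRmon.

```r
# Hypothetical helper: returns the names in `pkgs` that are not installed
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Package list taken from the paragraph above
needed <- c("GPArotation", "Matrix", "pracma", "ropls", "stats", "VennDiagram")

# Character vector of missing packages (empty if everything is installed)
missing_packages(needed)
```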

<h2>Load the data</h2>

Example data is provided here: https://gitlab.univ-nantes.fr/bertrand-s-1/pochermon/blob/master/data/Data_Co-Culture2.csv

The data should be formatted with samples in rows and variables in columns.

A list of sample types needs to be created to easily identify which samples correspond to the two pure metabolomes (here “Fungi1” and “Fungi2”) and which correspond to the mixed metabolome (“Fungi1VSFungi2”). In this case, the “rowheader” list is created from the first column, which contains the sample type.

`rowheader<-data[,1]`

Then, the first column of the data matrix is removed.

`data<-data[,2:dim(data)[2]]`
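The two steps above can be tried on a small made-up table. This is only a sketch: the `toy` data frame, its column names and its values are invented for illustration and are not the example dataset.

```r
# Toy table (hypothetical values): first column = sample type, others = variables
toy <- data.frame(
  type = rep(c("Fungi1", "Fungi2", "Fungi1VSFungi2"), each = 2),
  var1 = c(10, 12, 0, 1, 11, 9),
  var2 = c(0, 0, 5, 6, 4, 7)
)

# Keep the sample types in a separate vector
rowheader <- toy[, 1]

# Drop the first column; drop = FALSE keeps a data frame
# even if only one variable remains
data <- toy[, -1, drop = FALSE]

dim(data)          # samples in rows, variables in columns
table(rowheader)   # number of replicates per sample type
```

Negative indexing (`toy[, -1]`) is equivalent to selecting columns 2 to the last one, as done above.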

This data matrix contains 728 variables and 18 samples (6 replicates of each sample type).

<h2>Univariate data analysis with PocheRmon</h2>

The data analysis strategy consists of directly comparing the samples from each pure metabolome with the mixed metabolome, one pure metabolome at a time. In the case presented here, “Fungi1” is compared to “Fungi1VSFungi2” and “Fungi2” is compared to “Fungi1VSFungi2” separately. The two analyses are then merged, for each variable, based on the pure metabolome in which that variable is the most intense. Statistical significance can be assessed using either the Wilcoxon test or the Student test.
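The strategy can be illustrated on simulated data. The sketch below is not the PocheRmon implementation: the helper `compare_groups`, the direction of the fold change (mixed divided by pure) and the use of the mean to decide where a variable is the most intense are all assumptions made for illustration.

```r
# Hypothetical helper, NOT the PocheRmon function: compares one pure
# metabolome with the mixed one, variable by variable.
compare_groups <- function(pure, mixed, test = c("Ttest", "Wtest")) {
  test <- match.arg(test)
  # Note: stats::t.test uses the Welch variant by default
  stat_fun <- if (test == "Ttest") stats::t.test else stats::wilcox.test
  sapply(seq_len(ncol(pure)), function(j) {
    c(FoldChange = mean(mixed[, j]) / mean(pure[, j]),   # assumed direction
      p.value    = stat_fun(mixed[, j], pure[, j])$p.value)
  })
}

# Simulated intensities: 6 replicates per sample type, 2 variables
set.seed(1)
n <- 6
fungi1 <- cbind(v1 = rnorm(n, 100, 5), v2 = rnorm(n, 10, 2))
fungi2 <- cbind(v1 = rnorm(n, 10, 2),  v2 = rnorm(n, 80, 5))
mixed  <- cbind(v1 = rnorm(n, 200, 5), v2 = rnorm(n, 20, 2))

# Two separate comparisons against the mixed metabolome
res1 <- compare_groups(fungi1, mixed)
res2 <- compare_groups(fungi2, mixed)

# For each variable, keep the comparison against the pure metabolome
# in which the variable is the most intense (mean intensity, an assumption)
use_first <- colMeans(fungi1) >= colMeans(fungi2)
merged <- res2
merged[, use_first] <- res1[, use_first]
colnames(merged) <- colnames(mixed)

merged   # statistics in rows, variables in columns
```

Here `v1` is dominated by the first pure metabolome and `v2` by the second, so the merged table takes the statistics for `v1` from the first comparison and those for `v2` from the second.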

The function provides the following results. First, a graphical representation of the results as a volcano plot.

Then, two result tables are provided: the selected features, called “result$featuresSelection”, and the results for all variables, called “result$Ttest” or “result$Wtest” depending on the statistical significance test requested. All these tables are structured as follows:

<table>

<tr>

<td></td>

<th>var<sub>1</sub></th>

<th>var<sub>2</sub></th>

<th>...</th>

<th>var<sub>i</sub></th>

<th>...</th>

</tr>

<tr>

<th>Ttest.FoldChange</th>

<td></td>

<td></td>

<td></td>

<td></td>

<td></td></tr>

<tr>

<th>Ttest.p-value</th>

<td></td>

<td></td>

<td></td>

<td></td>

<td></td></tr>

</table>

The variables shown in pink in the volcano plot are those listed in the “result$featuresSelection” table. The thresholds used for feature selection can be changed with the function arguments “lim.FC” and “lim.pVal”.
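How two such thresholds might act on a results table can be sketched as follows. The exact selection rule used by PocheRmon is not described here, so this example assumes a common volcano-plot rule (a large fold change in either direction combined with a small p-value). The table values and the threshold values are made up, and the variables `lim.FC` and `lim.pVal` below are plain R variables named after the arguments, not a call to the function.

```r
# Toy results table laid out like the tables above (values are made up)
res <- rbind(
  Ttest.FoldChange = c(var1 = 4.0,   var2 = 1.1,   var3 = 0.2,   var4 = 3.0),
  "Ttest.p-value"  = c(var1 = 0.001, var2 = 0.500, var3 = 0.010, var4 = 0.200)
)

# Hypothetical thresholds, named after the arguments mentioned above
lim.FC   <- 2      # assumed: minimum fold change, in either direction
lim.pVal <- 0.05   # assumed: maximum p-value

fc <- res["Ttest.FoldChange", ]
pv <- res["Ttest.p-value", ]

# Assumed rule: strong change (up or down) AND significant
keep <- (fc >= lim.FC | fc <= 1 / lim.FC) & pv <= lim.pVal

# drop = FALSE keeps the table layout even if a single variable is selected
featuresSelection <- res[, keep, drop = FALSE]
colnames(featuresSelection)
```

Stricter thresholds (a larger fold-change limit or a smaller p-value limit) shrink the selection, while looser ones enlarge it.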