💪
Omics data visualization in R and Python
  • Introduction
    • From Authors
    • Virtual environments - let's begin
    • Getting started with Python
    • Getting started with R
    • Example data sets
  • PERFORMING FUNDAMENTAL OPERATIONS ON OMICs DATA USING R
    • Fundamental data structures
    • Loading data into R
    • Preferred formats in metabolomics and lipidomics analysis
    • Preprocess data type using Tidyverse package
    • Useful R tricks and features in OMICs mining
      • Application of pipe (%>%) functions
      • Changing data frames format with pivot_longer()
      • Data wrangling syntaxes useful in OMICs mining
      • Writing functions in R
      • The 'for' loop in R (advanced)
  • PERFORMING FUNDAMENTAL OPERATIONS ON OMICs DATA USING PYTHON
    • Fundamental data structures
    • Loading data into Python
  • Missing values handling in R
    • Missing values – Introduction
    • Detecting missing values (DataExplorer R package)
    • Filtering out columns containing mostly NAs
    • Data imputation by different available R libraries
      • Basic data imputation in R with dplyr and tidyr (tidyverse)
      • Data imputation using recipes library (tidymodels)
      • Replacing NAs via k-nearest neighbor (kNN) model (VIM library)
      • Replacing NAs via random forest (RF) model (randomForest library)
  • Missing values handling in Python
    • Detecting missing values
    • Filtering out columns containing mostly NAs
    • Data imputation
  • Data transformation, scaling, and normalization in R
    • Data normalization in R - fundamentals
    • Data normalization to the internal standards (advanced)
    • Batch effect corrections in R (advanced)
    • Data transformation and scaling - introduction
    • Data transformation and scaling using different available R packages
      • Data transformation and scaling using mutate()
      • Data transformation and scaling using recipes R package
      • Data Normalization – bestNormalize R package
  • Data transformation, scaling, and normalization in Python
    • Data Transformation and scaling in Python
  • Metabolites and lipids descriptive statistical analysis in R
    • Computing descriptive statistics in R
    • Using gtsummary to create publication-ready tables
    • Basic plotting in R
      • Bar charts
      • Box plots
      • Histograms
      • Density plots
      • Scatter plots
      • Dot plots with ggplot2 and tidyplots
      • Correlation heat maps
    • Customizing ggpubr and ggplot2 charts in R
    • Creating interactive plots with ggplotly
    • GGally for quick overviews
  • Metabolites and lipids descriptive statistics analysis in Python
    • Basic plotting
    • Scatter plots and linear regression
    • Correlation analysis
  • Metabolites and lipids univariate statistics in R
    • Two sample comparisons in R
    • Multi sample comparisons in R
    • Adjustments of p-values for multiple comparisons
    • Effect size computation and interpretation
    • Graphical representation of univariate statistics
      • Results of tests as annotations in the charts
      • Volcano plots
      • Lipid maps and acyl-chain plots
  • Metabolites and lipids univariate statistical analysis in Python
    • Two sample comparisons in Python
    • Multi-sample comparisons in Python
    • Statistical annotations on plots
  • Metabolites and lipids multivariate statistical analysis in R
    • Principal Component Analysis (PCA)
    • t-Distributed Stochastic Neighbor Embedding (t-SNE)
    • Uniform Manifold Approximation and Projection (UMAP)
    • Partial Least Squares (PLS)
    • Orthogonal Partial Least Squares (OPLS)
    • Hierarchical Clustering (HC)
      • Dendrograms
      • Heat maps with clustering
      • Interactive heat maps
  • Metabolites and lipids multivariate statistical analysis in Python
    • Principal Component Analysis
    • t-Distributed Stochastic Neighbor Embedding
    • Uniform Manifold Approximation and Projection
    • PLS Discriminant Analysis
    • Clustered heatmaps
  • OMICS IN MACHINE LEARNING APPROACHES IN R AND PYTHON
    • Application of selected models to OMICs data
    • OMICs machine learning – Examples
  • References
    • Library versions
Powered by GitBook
On this page
  • Practical applications of multi sample comparisons (examples)
  • ANOVA test in R
  • Kruskal-Wallis test in R
  • Post hoc tests in R
  • Pairwise comparisons with PMCMRplus in R
  • PMCMRplus comparisons for multiple variables (advanced)
  1. Metabolites and lipids univariate statistics in R

Multi sample comparisons in R

Metabolites and lipids univariate statistical analysis in R

PreviousTwo sample comparisons in RNextAdjustments of p-values for multiple comparisons

Last updated 2 months ago

Multi-sample comparisons involve parametric ANOVA and non-parametric Kruskal-Wallis tests for comparing more than two samples simultaneously. If pairwise comparisons are necessary, then post hoc tests are performed.

Practical applications of multi sample comparisons (examples)

Multi sample comparisons are typically used to analyze lipid or metabolite levels across three or more unrelated experimental groups, among other applications. See the selected examples below:

1) ANOVA with post hoc tests:

  • V. de Laat et al. Intrinsic temperature increase drives lipid metabolism towards ferroptosis evasion and chemotherapy resistance in pancreatic cancer. DOI: - Fig. 1a & d (the one-way ANOVA with Šídák’s multiple comparisons test was used several times in this manuscript (more examples), e.g., in Fig. 1h for comparing proliferation as measured by BrdU incorporation for PANC-1 or BxPC3 cells, for three different treatments).

  • WC. Wang et al. Metabolomics facilitates differential diagnosis in common inherited retinal degenerations by exploring their profiles of serum metabolites. DOI: (ANOVA with Tukey’s honestly significant difference (HSD) post hoc test is used, e.g., for identification of 40 metabolites with the most significant differences among the seven IRD subgroups and control group - Fig. 2).

  • E. Rysman et al. De novo Lipogenesis Protects Cancer Cells from Free Radicals and Chemotherapeutics by Promoting Membrane Lipid Saturation. DOI: (ANOVA with Tukey's multiple comparison test was used to compare cell counts in the case of four different treatments—see Fig. 4A & B).

2) Kruskal-Wallis with post hoc tests:

  • J. Wu et al. Lipidomic signatures align with inflammatory patterns and outcomes in critical illness. DOI: (the authors use the Kruskal-Wallis test with Dunn's post hoc analysis, as seen in Fig. 1e, to compare circulating total lipid class concentrations between healthy controls and trauma patients; more examples in the manuscript).

  • R. Jirásko et al. Altered Plasma, Urine, and Tissue Profiles of Sulfatides and Sphingomyelins in Patients with Renal Cell Carcinoma. DOI: (the comparison of plasma and urine sphingolipid levels among healthy controls, patients with early and advanced-stage kidney cancer via Kruskal-Wallis with Conover post hoc test).

  • W. Xiao et al. Lipid metabolism of plasma-derived small extracellular vesicles in COVID-19 convalescent patients. DOI: (the authors use the Kruskal-Wallis test for comparing total lipids, total DG, and total PC levels across four experimental groups, no post hoc test applied - Fig. 2C - E; more examples in the manuscript).

ANOVA test in R

The ANOVA test for comparing more than two unpaired samples is computed through the anova_test() function from the rstatix package. The function, except for the classic independent measures ANOVA, enables also computing:

  • repeated measures ANOVA,

  • mixed ANOVA or split-plot ANOVA,

  • AR. Jirásko et al. Altered Plasma, Urine, and Tissue Profiles of Sulfatides and Sphingomyelins in Patients with Renal Cell Carcinoma. DOI: NCOVA test.

However, we will not discuss these additional tests here. We encourage you to consult the documentation of anova_test() for further information:

# Calling library:
library(rstatix)

# Reading the function documentation:
?anova_test()

A similar data inspection as for the t-test can be performed before ANOVA because the null hypothesis of ANOVA concerns differences in means. A Kruskal-Wallis test can be performed if sample distributions are skewed, and means are not representative values. Look at Glimpse at data before the hypothesis testing in the Two sample comparisons in R subchapter for more information about the data inspection.

The ANOVA test will again require the formula we used before:

numeric variable ~ grouping variable

For a single lipid, we use the following code:

anova_test(data, `SM 41:1;O2` ~ Label, detailed = T)

The output in the console:

ANOVA Table (type II tests)

  Effect      SSn      SSd DFn DFd      F        p p<.05  ges
1  Label 1424.648 2529.549   2 224 63.079 1.86e-22     * 0.36

The detailed test results contain sums of squares (SSn, SSd), the degrees of freedom (DFn, DFd), the F-value, the p-value which is significantly below the threshold (0.05), and the squared eta effect size of 0.36, which is considered large. The asterisk symbols are used to mark the p-value < 0.05 threshold (the alternative hypothesis is accepted).

For all numeric variables, we need to obtain a long tibble first, and then group the variables by lipid species. We store the obtained tidy tibble with results as "ANOVA.test.res":

# ANOVA test for all numeric variables in the data set:
ANOVA.test.res <-
  data %>%
  select(where(is.numeric), 
         `Label`) %>%
  pivot_longer(cols = `CE 16:1`:`SM 42:1;O2`,
               names_to = "Lipids",
               values_to = "Concentrations") %>%
  group_by(Lipids) %>%
  anova_test(Concentrations ~ Label)
  
print(ANOVA.test.res)

The output:

Kruskal-Wallis test in R

The Kruskal-Wallis test relies on similar assumptions as the Mann-Whitney U test. It extends in fact the Mann-Whitney U test for comparing three or more groups. It is a non-parametric test comparing mean ranks. The Kruskal-Wallis test is performed using the kruskal_test() function from the rstatix package. The function needs the data frame with our concentrations, and the formula:

numeric variable ~ grouping variable

Here are examples of code:

# Reading the documentation of the Kruskal-Wallis function:
?kruskal_test()

# Using the Kruskal-Wallis test for one selected lipid:
kruskal_test(data, `SM 41:1;O2` ~ Label)

The output in the R console:

# A tibble: 1 × 6
  .y.            n statistic    df        p method        
* <chr>      <int>     <dbl> <int>    <dbl> <chr>         
1 SM 41:1;O2   227      89.3     2 4.01e-20 Kruskal-Wallis

If we want to apply the Kruskal-Wallis test to all numeric variables, we need to use a long tibble. After obtaining the long tibble (via pivot_longer()), we group entries by lipid species - column Lipids. The new column Concentrations storing all numeric values is then used in the formula with the Label column for computing the Kruskal-Wallis test. Look at the example below:

# Using the Kruskal-Wallis test for all numeric variables:
KW.test.res <-
  data %>%
  select(where(is.numeric), 
         `Label`) %>%
  pivot_longer(cols = `CE 16:1`:`SM 42:1;O2`,
               names_to = "Lipids",
               values_to = "Concentrations") %>%
  group_by(Lipids) %>%
  kruskal_test(Concentrations ~ Label)

print(KW.test.res)

The output:

Post hoc tests in R

Using rstatix library, we can perform basic pairwise tests, including:

  • Tukey Honestly Significant Difference (Tukey HSD) (normally distributed data with equal variances),

  • Games-Howell test (if groups are characterized by different variances - homogeneity of variances is violated),

  • Dunn test (nonparametric post hoc test).

The Tukey HSD and Games-Howell test are computed following ANOVA (parametric analysis), while the Dunn test is performed following the Kruskal-Wallis test (nonparametric analysis).

Through pairwise comparisons, we find the differing pairs of groups. Below, you will find examples of code:

# Tukey HSD for one lipid:
tukey_hsd(data, `SM 41:1;O2` ~ Label)

We obtain:

# A tibble: 3 × 9
  term  group1 group2 null.value estimate conf.low conf.high    p.adj p.adj.signif
* <chr> <chr>  <chr>       <dbl>    <dbl>    <dbl>     <dbl>    <dbl> <chr>       
1 Label N      PAN             0    0.357    -1.55      2.26 8.98e- 1 ns          
2 Label N      T               0   -4.95     -6.05     -3.84 0        ****        
3 Label PAN    T               0   -5.30     -7.19     -3.41 7.75e-10 **** 
# Games Howell test for one lipid:
games_howell_test(data, `SM 41:1;O2` ~ Label)

The output:

# A tibble: 3 × 8
  .y.        group1 group2 estimate conf.low conf.high     p.adj p.adj.signif
* <chr>      <chr>  <chr>     <dbl>    <dbl>     <dbl>     <dbl> <chr>       
1 SM 41:1;O2 N      PAN       0.357    -2.02      2.74 0.927     ns          
2 SM 41:1;O2 N      T        -4.95     -6.04     -3.86 0         ****        
3 SM 41:1;O2 PAN    T        -5.30     -7.64     -2.97 0.0000213 **** 
# Dunn test for one lipid:
dunn_test(data, `SM 41:1;O2` ~ Label, p.adjust.method = 'none')

The output:

# A tibble: 3 × 9
  .y.        group1 group2    n1    n2 statistic        p    p.adj p.adj.signif
* <chr>      <chr>  <chr>  <int> <int>     <dbl>    <dbl>    <dbl> <chr>       
1 SM 41:1;O2 N      PAN       97    21    0.0936 9.25e- 1 9.25e- 1 ns          
2 SM 41:1;O2 N      T         97   109   -8.97   3.07e-19 3.07e-19 ****        
3 SM 41:1;O2 PAN    T         21   109   -5.35   8.98e- 8 8.98e- 8 **** 

All three tests can be computed for multiple variables through long tibbles:

# Tukey HSD for multiple lipids (long tibble):
Tukey.res <-
  data %>%
  select(where(is.numeric), 
         `Label`) %>%
  pivot_longer(cols = `CE 16:1`:`SM 42:1;O2`,
               names_to = "Lipids",
               values_to = "Concentrations") %>%
  group_by(Lipids) %>% 
  tukey_hsd(Concentrations ~ Label)
  
print(Tukey.res)

We obtain the following tibble:

The code below allows for computing the Games-Howell test for multiple variables simultaneously:

# Games-Howell test for multiple lipids (long tibble):
GH.res <-
  data %>%
  select(where(is.numeric), 
         `Label`) %>%
  pivot_longer(cols = `CE 16:1`:`SM 42:1;O2`,
               names_to = "Lipids",
               values_to = "Concentrations") %>%
  group_by(Lipids) %>% 
  games_howell_test(Concentrations ~ Label)

print(GH.res)

The results:

Similarly for the Dunn post hoc test, we use:

# Dunn test for multiple lipids (long tibble):
Dunn.res <-
  data %>%
  select(where(is.numeric), 
         `Label`) %>%
  pivot_longer(cols = `CE 16:1`:`SM 42:1;O2`,
               names_to = "Lipids",
               values_to = "Concentrations") %>%
  group_by(Lipids) %>% 
  dunn_test(Concentrations ~ Label, p.adjust.method = 'none') # For now, no adjustment of p-value.

print(Dunn.res)

The results:

Pairwise comparisons with PMCMRplus in R

If you are looking for different types of tests for pairwise comparisons, the PMCMRplus R package is the solution that you need. Below, you will find the vignette of this package:

Here, we will show you one simple example of how the PMCMRplus package can be used. Take a look at the code below:

# First, let's fit and compute the ANOVA using base R functions:
model <- aov(`SM 41:1;O2` ~ Label, data)
anova(model)

We obtain in the console:

Analysis of Variance Table

Response: SM 41:1;O2
           Df Sum Sq Mean Sq F value    Pr(>F)    
Label       2 1424.7  712.32  63.079 < 2.2e-16 ***
Residuals 224 2529.6   11.29                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Now, for the model object, we could compute, for instance, the Fisher's Least Significant Difference Test (Fisher's LSD), assuming the normal distribution of all samples and that variances are similar across groups:

# Fisher's LSD test via PMCMRplus package:
lsdTest(model)

We obtain in the console:

Pairwise comparisons using Fisher's Least Significant Difference Test

data: SM 41:1;O2 by Label

    N       PAN    
PAN 0.66    -      
T   < 2e-16 2.6e-10

P value adjustment method: none
alternative hypothesis: two.sided

Based on ANOVA, we find that the concentration of SM 41:1;O2 is significantly different in at least one group of the three groups (N, PAN, T) from the overall mean. Hence, we accept the alternative hypothesis of ANOVA. Using Fisher's LSD test, we detect that significant differences occur between controls (N) and patients with PDAC (T), but also between patients with pancreatitis (PAN) and patients with PDAC (T). No differences in SM 41:1;O2 levels were found between the healthy controls (N) and patients with pancreatitis (PAN).

After analyzing all samples' distribution, we already know that the mean for many of the lipid species was not the most representative value of biological groups. Distributions of samples were skewed. Also, for most lipids, we found differences in variances between the compared groups. The non-parametric Kruskal-Wallis test could be applied in this case:

# Computing the Kruskal-Wallis test from the PMCMRplus:
kruskalTest(`SM 41:1;O2` ~ Label, data = data, dist = "Chisquare")

We obtain in the console:

	Kruskal-Wallis test

data:  SM 41:1;O2 by Label
chi-squared = 89.328, df = 2, p-value < 2.2e-16

Next, we can use a nonparametric Conover test for pairwise comparisons:

# Performing Conover test in R via PMCMRplus:
Conover.test <- 
  kwAllPairsConoverTest(`SM 41:1;O2` ~ Label, 
                              data = data, 
                              p.adjust.method = "none")
                              
# Printing the results of the test:
summary(Conover.test)

And we obtain:

	Pairwise comparisons using Conover's all-pairs test

data: SM 41:1;O2 by Label
P value adjustment method: none
H0
             t value   Pr(>|t|)    
PAN - N == 0   0.120    0.90472    
T - N == 0   -11.479 < 2.22e-16 ***
T - PAN == 0  -6.844 7.2764e-11 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The results of this test are similar to the ANOVA - based on the Kruskal Wallis test, we know that at least one sample comes from a different population than others (simplifying the alternative hypothesis). From the Conover all-pairs test, we know that differences occur between healthy controls (N) and patients with PDAC (T), and controls (N) and patients with pancreatitis (PAN). Also, no differences between controls (N) and patients with pancreatitis (PAN) were found.

PMCMRplus comparisons for multiple variables (advanced)

Here, we will show you how to loop the PMCMRplus functions on numeric columns of your tibble and obtain a tibble with the outcomes of a selected post hoc test. This material is advanced, and it may be complicated for beginners. If you focus only on basic analysis and visualization of lipidomics and metabolomics data, you skip this subchapter.

As you know from above, PMCMRplus is a complete solution for pairwise testing in R. The PMCMRplus package also contains a function toTidy(), which builds a tidy data frame from the output of many PMCMRplus functions. Using for() loop in R, we can perform the post hoc test for every numeric column and, using toTidy(), store the results from tests in the list. See the code below:

# for() loop with PMCMRplus post hoc test.
# Calling libraries:
library(PMCMRplus)
library(tidyverse)

# ANOVA LSD post hoc 
# Create a list of lists with results:
results.LSD <- list()

# Before calculating the LSD posthoc, ensure all column types are adjusted!
# Important: 
# 1) `Sample Name` must be <chr>, 
# 2) `Label` must be a <fct>,
# 3) All lipid concentrations should be numeric.

# If necessary, use:
# str(data)
# data$`Sample Name` <- as.character(data$`Sample Name`)
# data$Label <- as.factor(data$Label)

# This applies to all code blocks below!

# Defining for() loop to perform Fisher's LSD post hoc:
for (i in 3:129) {
  results.LSD[[i]] <- toTidy(lsdTest(data[[i]], g=data$Label, data = data))
  
}

# Before the loop we define the repeating element 'i' and its range of columns: 3:129.
# We open the loop.
# In the loop, we want to create a list of lists results.LSD[[i]].
# Every post hoc outcome will be stored as a new list in results.LSD list.
# The outcome of each test is turned into a data frame using toTidy().
# Finally, we define the lsdTest():
# 1) The outcomes to be compared as data[[i]],
# 2) A grouping variable - data$Label, 
# 3) And a data frame to perform the post hoc test on: data = data.

Our post hoc test results are now stored as a list of lists named 'results.LSD'. We will change it into a tibble with all post hoc results:

# Creating a tibble from 'results.LSD'.
# Install and call library data.table:
install.packages('data.table')
library(data.table)

# First, using purrr package from tidyverse we need to change lists into data tables:
data_table_list <- map(results.LSD, as.data.table)

# Using rbindlist(), we create a data frame:
data.frame.LSD <- rbindlist(data_table_list, fill = TRUE, idcol = T)

# We change the data frame into a tibble:
posthoc.LSD <- as_tibble(data.frame.LSD)

# Now we can print it:
print(posthoc.LSD)

And we obtain this tibble with all posthoc results:

We need to add lipid species to this tibble:

# Adding lipid species to posthoc.LSD tibble:
# First, we separate all names of numeric columns from 'data' data frame.
# We store them as tibble "Lipids":
Lipids <- as_tibble(colnames(data[,c(3:129)]))

# As we perform posthoc on three groups, every three entries in posthoc.LSD are the same lipid.
# We multiply every entry in "Lipids" three times using slice() with rep():
Lipids <- 
  Lipids %>% 
  slice(rep(1:n(), each = 3))

# We add lipid annotations to the final tibble:
posthoc.LSD$Lipids <- Lipids$value

# NOTE: DO NOT CHANGE THE ORDER - it is the same in posthoc.LSD and Lipids tibbles!!!

# Print the output in the console: 
print(posthoc.LSD)

The final output:

We proceed similarly for Games-Howell and Tukey HSD:

# Games-Howell posthoc test:
results.GHT <- list()

# Defining for() loop to perform Games-Howell post hoc:
for (i in 3:129) {
  results.GHT[[i]] <- toTidy(gamesHowellTest(data[[i]], g=data$Label, data = data))
  
}

# Changing lists into data tables in results.GHT list:
data_table_list <- map(results.GHT, as.data.table)

# Using rbindlist(), we create a data frame:
data.frame.GHT <- rbindlist(data_table_list, fill = TRUE, idcol = T)

# We change the data frame into a tibble:
posthoc.GHT <- as_tibble(data.frame.GHT)

# Adding lipid species to posthoc.GHT tibble:
Lipids <- as_tibble(colnames(data[,c(3:129)]))

Lipids <- 
  Lipids %>% 
  slice(rep(1:n(), each = 3))

# We add lipid annotations to the final tibble:
posthoc.GHT$Lipids <- Lipids$value

# Printing the results:
print(posthoc.GHT)

The output:

# Tukey HSD posthoc test:
results.HSD <- list()

# Defining for() loop to perform Tukey HSD post hoc:
for (i in 3:129) {
  results.HSD[[i]] <- toTidy(tukeyTest(data[[i]], g=data$Label, data = data))
  
}

# Changing lists into data tables:
data_table_list <- map(results.HSD, as.data.table)

# Using rbindlist(), we create a data frame:
data.frame.HSD <- rbindlist(data_table_list, fill = TRUE, idcol = T)

# We change the data frame into a tibble:
posthoc.HSD <- as_tibble(data.frame.HSD)

# Adding lipid species to posthoc.HSD tibble:
Lipids <- as_tibble(colnames(data[,c(3:129)]))

Lipids <- 
  Lipids %>% 
  slice(rep(1:n(), each = 3))

# We add lipid annotations to the final tibble:
posthoc.HSD$Lipids <- Lipids$value

# Printing the results:
print(posthoc.HSD)

The output:

We will also proceed similarly for non-parametric post hoc tests used after the Kruksal-Wallis test, i.e., Dunn, Conover, and Nemenyi posthoc tests:

# Dunn posthoc test:
results.Dunn <- list()

# Defining for() loop to perform Dunn post hoc:
for (i in 3:129) {
  results.Dunn[[i]] <- toTidy(kwAllPairsDunnTest(data[[i]], g=data$Label, data = data, 
                                                 p.adjust.method = 'none'))
  
}

# Changing lists into data tables:
data_table_list <- map(results.Dunn, as.data.table)

# Using rbindlist(), we create a data frame:
data.frame.Dunn <- rbindlist(data_table_list, fill = TRUE, idcol = T)

# We change the data frame into a tibble:
posthoc.Dunn <- as_tibble(data.frame.Dunn)

# Adding lipid species to posthoc.Dunn tibble:
Lipids <- as_tibble(colnames(data[,c(3:129)]))

Lipids <- 
  Lipids %>% 
  slice(rep(1:n(), each = 3))

# We add lipid annotations to the final tibble:
posthoc.Dunn$Lipids <- Lipids$value

# Printing the results:
print(posthoc.Dunn)

The output:

# Conover posthoc test:
results.Conover <- list()

# Defining for() loop to perform Conover post hoc:
for (i in 3:129) {
  results.Conover[[i]] <- toTidy(kwAllPairsConoverTest(data[[i]], g=data$Label, data = data, p.adjust.method = 'none'))
  
}

# Changing lists into data tables:
data_table_list <- map(results.Conover, as.data.table)

# Using rbindlist(), we create a data frame:
data.frame.Conover <- rbindlist(data_table_list, fill = TRUE, idcol = T)

# We change the data frame into a tibble:
posthoc.Conover <- as_tibble(data.frame.Conover)

# Adding lipid species to posthoc.Conover tibble:
Lipids <- as_tibble(colnames(data[,c(3:129)]))

Lipids <- 
  Lipids %>% 
  slice(rep(1:n(), each = 3))

# We add lipid annotations to the final tibble:
posthoc.Conover$Lipids <- Lipids$value

# Printing the results:
print(posthoc.Conover)

The output:

# Nemenyi posthoc test:
results.Nemenyi <- list()

# Defining for() loop to perform Nemenyi post hoc:
for (i in 3:129) {
  results.Nemenyi[[i]] <- toTidy(kwAllPairsNemenyiTest(data[[i]], g=data$Label, data = data))
  
}

# Changing lists into data tables:
data_table_list <- map(results.Nemenyi, as.data.table)

# Using rbindlist(), we create a data frame:
data.frame.Nemenyi <- rbindlist(data_table_list, fill = TRUE, idcol = T)

# We change the data frame into a tibble:
posthoc.Nemenyi <- as_tibble(data.frame.Nemenyi)

# Adding lipid species to posthoc.Nemenyi tibble:
Lipids <- as_tibble(colnames(data[,c(3:129)]))

Lipids <- 
  Lipids %>% 
  slice(rep(1:n(), each = 3))

# We add lipid annotations to the final tibble:
posthoc.Nemenyi$Lipids <- Lipids$value

# Printing the results:
print(posthoc.Nemenyi)

The output:

NOTE! If missing values (NA, NaN) are present in the data frame, the functions from rstatix library will drop the rows containing missing values while performing multi sample hypothesis testing.

https://doi.org/10.1038/s41467-024-52978-z
https://doi.org/10.1038/s41467-024-47911-3
https://doi.org/10.1158/0008-5472.CAN-09-3871
https://doi.org/10.1038/s41467-022-34420-4
https://doi.org/10.3390/cancers14194622
https://doi.org/10.1038/s41598-023-43189-5
https://doi.org/10.3390/cancers14194622
Introduction to the PMCMRplus package.
The results of independent measures ANOVA computed using anova_test() from the rstatix package.
The results of Kruskal-Wallis test computed using kruskal_test() from the rstatix package.
Tukey HSD post hoc test results computed using the tukey_hsd() function from rstatix package.
Games Howell post hoc test results computed using the games_howell_test() function from rstatix package.
Dunn post hoc test results computed using the dunn_test() function from rstatix package.
The final tibble with Fisher's LSD post hoc test results. The post hoc test was computed through lsdTest() for-looped across 'data' tibble.
The 'posthoc.LSD' tibble with all results and lipid annotations.
The 'posthoc.GHT' tibble with all results of Games-Howell post-hoc test and lipid annotations.
The 'posthoc.HSD' tibble with all Tukey HSD post-hoc test results and lipid annotations.
The 'posthoc.Dunn' tibble with all Dunn post-hoc test results and lipid annotations.
The 'posthoc.Conover' tibbles with all Conover posthoc test results and lipid annotations.
The 'posthoc.Nemenyi tibble with all Tukey-Kramer-Nemenyi post-hoc test results and lipid annotations.