# Effect size computation and interpretation

So far, we have focused on determining **statistical significance**, expressed as the *p*-value: we investigated whether a difference exists between two or more biological groups. At this point, it is important to note that statistical significance is influenced by sample size. If the sample sizes are large, even a small difference between two groups can be statistically significant.

Now, we would like to introduce an important measure that should be reported together with statistical significance: the **effect size**. The effect size is the magnitude of a difference between biological groups; it shows whether the effect is large enough to be meaningful, e.g., useful for further investigation. It is difficult to make such a judgment using *p*-values alone, given the influence of sample size on statistical significance. As a result, many journals expect authors to report effect sizes together with *p*-values.
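To make the influence of sample size concrete, here is a minimal base R sketch. It uses artificial, deterministic values generated with `qnorm(ppoints(n))` rather than the lipidomics data set; the helper `compare_groups()` and the shift of 0.2 between the groups are assumptions made only for this illustration.

```r
# Illustrative sketch with artificial data (not the lipidomics data set).
# compare_groups() is a hypothetical helper written for this example only.
compare_groups <- function(n, shift = 0.2) {
  # Deterministic, approximately standard-normal values (no random sampling):
  a <- qnorm(ppoints(n))
  # Group B is group A shifted by a fixed amount:
  b <- a + shift

  # Welch t-test p-value (base R):
  p <- t.test(a, b)$p.value

  # Cohen's d with the average of the two variances in the denominator:
  d <- (mean(b) - mean(a)) / sqrt((var(a) + var(b)) / 2)

  c(n = n, p.value = p, cohens.d = d)
}

# The same shift between groups, two very different sample sizes:
small.n <- compare_groups(20)
large.n <- compare_groups(2000)

print(rbind(small.n, large.n))
```

Under these assumptions, the effect size stays close to 0.2 for both sample sizes, while only the larger sample yields a *p*-value below 0.05.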

Here, you will find information on how to compute effect sizes in R using the rstatix or ggstatsplot libraries, and which effect sizes can be used with the statistical tests from the previous subchapters.

## Selecting effect size for basic statistical tests

Below you will find examples of effect sizes that can be computed and reported together with the *t*-test, Mann-Whitney *U* test, ANOVA, and Kruskal-Wallis test:

<table><thead><tr><th width="205">Statistical test</th><th width="200">Effect size</th><th>Example of effect size interpretation</th></tr></thead><tbody><tr><td><em>t</em>-test</td><td><p>1 - Cohen’s d, </p><p>2 - Hedges’ g</p></td><td>0.2 - small effect, 0.5 - medium effect, 0.8 and more - large effect</td></tr><tr><td>Mann-Whitney <em>U</em> test</td><td>1 - r value<br>2 - Rank biserial correlation</td><td>For r value: <br>0.1 to &#x3C;0.3 - small effect, 0.3 to &#x3C;0.5 - moderate effect, 0.5 and more - large effect<br><br>For rank biserial correlation: <br>-1 - perfect negative relationship, 0 - no effect, 1 - perfect positive relationship</td></tr><tr><td>ANOVA</td><td>1 - η² Eta Squared</td><td>0.01 - small effect size, 0.06 - medium effect size, 0.14 and more - large effect size</td></tr><tr><td>Kruskal-Wallis test</td><td>1 - ε² Epsilon Squared<br>2 - η² Eta Squared</td><td>0.01 - small effect size, 0.06 - medium effect size, 0.14 and more - large effect size</td></tr></tbody></table>
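The Cohen's d thresholds from the table can be turned into labels with a small helper. The sketch below is only an illustration: `interpret_cohens_d()` is a hypothetical function (it is not part of rstatix), and labelling values below 0.2 as "negligible" is an assumption of this example.

```r
# Hypothetical helper: labels |d| using the Cohen's d thresholds from the table.
# Values below 0.2 are labelled "negligible" (an assumption of this sketch).
interpret_cohens_d <- function(d) {
  labels <- cut(abs(d),
                breaks = c(0, 0.2, 0.5, 0.8, Inf),
                labels = c("negligible", "small", "medium", "large"),
                right = FALSE)  # intervals are closed on the left, e.g. [0.2, 0.5)
  as.character(labels)
}

# The sign of d only indicates the direction of the difference,
# so the absolute value is used for the interpretation:
interpret_cohens_d(c(0.1, -0.35, 0.5, 1.2))
```

Such thresholds are conventions rather than strict rules, so the labels should be read together with the biological context.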

## Computing effect size in R

The rstatix library contains dedicated functions for computing effect sizes. You will find examples in the code blocks below:

```r
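# Computing effect size in R with rstatix.
# Loading the libraries (skip if they are already loaded in your session):
library(tidyverse)
library(rstatix)

# Cohen's d effect size for t-test - via cohens_d().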
# Documentation:
?cohens_d()

# Computing effect size:
Cohens.d <- 
  data %>%
  select(-`Sample Name`) %>%
  pivot_longer(cols = `CE 16:1`:`SM 41:1;O2`,
               names_to = "Lipids",
               values_to = "Concentrations") %>%
  group_by(Lipids) %>%
  cohens_d(Concentrations ~ Label, 
           ref.group = "N")

print(Cohens.d)
```

We obtain:

<figure><img src="/files/ZUOE0dodsif5XdimB01j" alt=""><figcaption><p>Cohen's d effect size computed through cohens_d() function from the rstatix library.</p></figcaption></figure>
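To show what lies behind these numbers, the following base R sketch computes Cohen's d and Hedges' g by hand for two small groups. The values are hypothetical toy data, not taken from the lipidomics data set. The denominator used here (the square root of the average of the two variances) is, to our understanding of the rstatix documentation, the one applied when equal variances are not assumed; the Hedges' correction factor is one commonly used approximation and may differ slightly from the correction implemented in rstatix.

```r
# Toy data (hypothetical values, not from the lipidomics data set):
group.1 <- c(2, 4, 6, 8, 10)
group.2 <- c(1, 3, 5, 7, 9)

# Cohen's d without assuming equal variances:
# difference in means divided by the square root of the average variance.
cohens.d <- (mean(group.1) - mean(group.2)) /
  sqrt((var(group.1) + var(group.2)) / 2)

# Hedges' g: Cohen's d multiplied by a small-sample correction factor.
# The factor below is one commonly used approximation.
n.total <- length(group.1) + length(group.2)
hedges.g <- cohens.d * (1 - 3 / (4 * n.total - 9))

print(c(cohens.d = cohens.d, hedges.g = hedges.g))
```

Because the correction factor is always below 1, Hedges' g is slightly smaller in magnitude than Cohen's d, and the difference shrinks as the sample size grows.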

```r
# Computing effect size in R with rstatix.
# r value effect size for Mann-Whitney U test - via wilcox_effsize().
# Documentation:
?wilcox_effsize()

# Installing additional libraries needed.
# Note - the function may need additional packages; here, the coin package was installed:
install.packages("coin")

# Computing effect size:
r.value <- 
  data %>%
  select(-`Sample Name`) %>%
  pivot_longer(cols = `CE 16:1`:`SM 41:1;O2`,
               names_to = "Lipids",
               values_to = "Concentrations") %>%
  group_by(Lipids) %>%
  wilcox_effsize(Concentrations ~ Label, 
                 ref.group = "N")

print(r.value)
```

We obtain the following tibble:

<figure><img src="/files/9ourxVf6BTcAcFK1qmi3" alt=""><figcaption><p>The r-value effect size computed through wilcox_effsize() from the rstatix library.</p></figcaption></figure>
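The r value is the standardized test statistic Z divided by the square root of the total number of observations. The base R sketch below illustrates this, together with the rank biserial correlation, on hypothetical toy data without ties. It uses the plain normal approximation without continuity correction, so it is a simplified illustration and not a re-implementation of `wilcox_effsize()`.

```r
# Toy data (hypothetical values without ties):
x <- c(1, 2, 3, 4, 5)
y <- c(6, 7, 8, 9, 10)
n1 <- length(x)
n2 <- length(y)

# U statistic for x: number of (x, y) pairs in which x exceeds y.
U1 <- sum(outer(x, y, ">"))

# Normal approximation of U under the null hypothesis
# (valid without ties; no continuity correction applied):
z <- (U1 - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# r value: |Z| divided by the square root of the total sample size.
r.value <- abs(z) / sqrt(n1 + n2)

# Rank biserial correlation: (favourable - unfavourable pairs) / all pairs.
rank.biserial <- 2 * U1 / (n1 * n2) - 1

print(c(r.value = r.value, rank.biserial = rank.biserial))
```

In this toy example every value of `x` lies below every value of `y`, so the rank biserial correlation reaches its lower limit of -1.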

**NOTE:** If you carefully check the tibble with the ANOVA test results obtained from the anova\_test() function, you will find the last column named **'ges'**, which stands for **generalized eta squared**. This effect size is computed automatically:

```r
# Computing effect size in R with rstatix.
# Eta squared effect size for ANOVA:
ANOVA <- 
  data %>%
  select(-`Sample Name`) %>%
  pivot_longer(cols = `CE 16:1`:`SM 41:1;O2`,
               names_to = "Lipids",
               values_to = "Concentrations") %>%
  group_by(Lipids) %>%
  anova_test(Concentrations ~ Label)

print(ANOVA)
```

Here is an example tibble:

<figure><img src="/files/0R6BlMaw2JyqJJEFXg5i" alt=""><figcaption><p>The generalized eta squared effect size computed automatically by anova_test() from the rstatix package.</p></figcaption></figure>

The package also provides a function called **eta\_squared()**. Using it, you can compute the effect size for the base ANOVA model built through the aov() function, for instance:

```r
# Using the eta_squared() function from the rstatix library:
aov <- aov(`SM 41:1;O2` ~ Label, data)
eta_squared(aov)
```

The output in the R console:

```r
> aov <- aov(`SM 41:1;O2` ~ Label, data)
> eta_squared(aov)
    Label 
0.3602876 
```

You will find exactly the same value in the 'ges' column of the tibble with the ANOVA test results.
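For a one-way design, eta squared is the between-group sum of squares divided by the total sum of squares. The following base R sketch shows this on hypothetical toy data (three groups named A, B, and C, which are assumptions of this example):

```r
# Toy data (hypothetical values, three groups of three observations):
toy <- data.frame(
  Group = factor(rep(c("A", "B", "C"), each = 3)),
  Value = c(1, 2, 3, 2, 3, 4, 3, 4, 5)
)

# One-way ANOVA model built with base R:
fit <- aov(Value ~ Group, data = toy)

# Sums of squares from the ANOVA table: first the Group term, then the residuals.
ss <- summary(fit)[[1]][["Sum Sq"]]

# Eta squared = SS between groups / SS total.
eta.sq <- ss[1] / sum(ss)

print(eta.sq)
```

Here, half of the total variability is attributable to the differences between the groups.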

```r
# Computing effect size in R with rstatix.
# Eta squared effect size for Kruskal-Wallis test - via kruskal_effsize().
# Documentation:
?kruskal_effsize()

# Computing effect size:
KW.effect.size <- 
  data %>%
  select(-`Sample Name`) %>%
  pivot_longer(cols = `CE 16:1`:`SM 41:1;O2`,
               names_to = "Lipids",
               values_to = "Concentrations") %>%
  group_by(Lipids) %>%
  kruskal_effsize(Concentrations ~ Label)

print(KW.effect.size)
```

We obtain:

<figure><img src="/files/L09RBbleMqKj49xahsj5" alt=""><figcaption><p>The eta squared based on the H-statistic - effect size for the Kruskal-Wallis test computed through kruskal_effsize() from the rstatix package.</p></figcaption></figure>
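The eta squared based on the H-statistic can be written as (H - k + 1) / (n - k), where H is the Kruskal-Wallis statistic, k is the number of groups, and n is the total number of observations. The sketch below applies this formula, and a commonly cited formula for epsilon squared, H / (n - 1), to hypothetical toy data in base R. Treat it as an illustration of the formulas rather than a replacement for `kruskal_effsize()`.

```r
# Toy data (hypothetical values without ties, three groups of three):
toy <- data.frame(
  Group = factor(rep(c("A", "B", "C"), each = 3)),
  Value = 1:9
)

# Kruskal-Wallis test from base R; H is the test statistic.
kw <- kruskal.test(Value ~ Group, data = toy)
H <- unname(kw$statistic)

k <- nlevels(toy$Group)  # number of groups
n <- nrow(toy)           # total number of observations

# Eta squared based on the H-statistic:
eta.sq.H <- (H - k + 1) / (n - k)

# Epsilon squared (one commonly cited formula):
epsilon.sq <- H / (n - 1)

print(c(H = H, eta.sq.H = eta.sq.H, epsilon.sq = epsilon.sq))
```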

Effect sizes are also computed automatically by the ggstatsplot library. Look at the examples below:

```r
# Creating violin box plots with statistical annotations (ggstatsplot).

# Plot 1:
Welch <-
  data %>%
  select(`Label`,
         `SM 41:1;O2`) %>%
  filter(Label != "PAN") %>%
  ggbetweenstats(x = Label, y = "SM 41:1;O2", type = 'parametric') +
  scale_color_manual(values = c("royalblue", "red2"))
  
# Plot 2:
MW <-
  data %>%
  select(`Label`,
         `SM 41:1;O2`) %>%
  filter(Label != "PAN") %>%
  ggbetweenstats(x = Label, y = "SM 41:1;O2", type = 'nonparametric') +
  scale_color_manual(values = c("royalblue", "red2"))
  
# Plot 3:
ANOVA <-
  data %>%
  select(`Label`,
         `SM 41:1;O2`) %>%
  ggbetweenstats(x = Label, y = "SM 41:1;O2", type = 'parametric') +
  scale_color_manual(values = c("royalblue", "orange", "red2"))
  
# Plot 4:
KW <-
  data %>%
  select(`Label`,
         `SM 41:1;O2`) %>%
  ggbetweenstats(x = Label, y = "SM 41:1;O2", type = 'nonparametric') +
  scale_color_manual(values = c("royalblue", "orange", "red2"))
  
# Creating a list of plots (named 'plots' to avoid masking the base list() function):
plots <- list(Welch, MW, ANOVA, KW)

# Combining all plots into one image:
combine_plots(plots)
```

We obtain:

<figure><img src="/files/ps2oQKwHtGphX3p9elGe" alt=""><figcaption><p>The ggstatsplot box plots with detailed statistical annotations. In red frames - effect sizes computed automatically by the ggbetweenstats() function. Here, for ANOVA the omega squared effect size was proposed.</p></figcaption></figure>
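Omega squared is a less biased alternative to eta squared. As a rough illustration, the base R sketch below computes the classical omega squared for a one-way ANOVA on hypothetical toy data. This is an assumption-laden example: the value reported by ggbetweenstats() may be computed differently (for instance, when the parametric test does not assume equal variances), so the sketch is not expected to reproduce the figure.

```r
# Toy data (hypothetical values, three groups of three observations):
toy <- data.frame(
  Group = factor(rep(c("A", "B", "C"), each = 3)),
  Value = c(1, 2, 3, 2, 3, 4, 3, 4, 5)
)

# Classical one-way ANOVA table from base R:
tab <- summary(aov(Value ~ Group, data = toy))[[1]]

ss.between <- tab[["Sum Sq"]][1]   # sum of squares between groups
ss.within  <- tab[["Sum Sq"]][2]   # residual (within-group) sum of squares
df.between <- tab[["Df"]][1]       # degrees of freedom of the Group term
ms.within  <- tab[["Mean Sq"]][2]  # residual mean square

# Classical omega squared for a one-way design:
omega.sq <- (ss.between - df.between * ms.within) /
  (ss.between + ss.within + ms.within)

print(omega.sq)
```

For the same toy data, omega squared is lower than eta squared, because it corrects for the upward bias of eta squared in small samples.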


