Metabolites and lipids descriptive statistical analysis in R
The density plot was introduced in the previous subchapter. Density plots depict the distribution of a numeric variable. A density plot can be viewed as a smoothed version of a histogram, so its applications are similar to those of a histogram. Density plots can be an effective way of visualizing concentration distributions of lipids or metabolites in biological materials. See the following example published in Nature:
T. Takeuchi et al. Gut microbial carbohydrate metabolism contributes to insulin resistance. DOI: https://doi.org/10.1038/s41586-023-06466-x - Fig. 1d & 2b, f (the authors used density plots, e.g., to present and compare fecal levels of monosaccharides (as decimal logarithms) across experimental groups, and HOMA-IR, BMI, triglyceride (TG) and HDL-C levels among the participant clusters).
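The relationship between a histogram and a density plot can be illustrated with a few lines of base R. This is only a minimal sketch: the vector conc holds simulated, hypothetical concentrations (not data from this chapter), and the final step checks numerically that the area under the estimated density curve is close to 1.

```r
# Hypothetical, simulated concentrations (not data from this chapter)
set.seed(1)
conc <- rlnorm(200, meanlog = 2, sdlog = 0.3)

# Histogram on the density scale (the bar areas sum to 1)
hist(conc, breaks = 20, freq = FALSE,
     main = "Histogram with kernel density estimate",
     xlab = "Concentration")

# Kernel density estimate drawn over the histogram
d <- density(conc)
lines(d, col = "red2", lwd = 2)

# Trapezoidal rule: the area under the density curve should be close to 1
area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
```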
Preparing density plots via DataExplorer (level: basic)
DataExplorer enables producing a simple density plot through plot_density():
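A minimal sketch of such a call is shown below. It assumes a data frame with numeric lipid columns; the toy object is simulated and purely hypothetical, and only the column names are borrowed from this chapter.

```r
library(DataExplorer)

# Hypothetical stand-in for the lipidomics table (simulated values)
set.seed(1)
toy <- data.frame(
  `SM 39:1;O2` = rlnorm(100, meanlog = 1, sdlog = 0.3),
  `SM 41:1;O2` = rlnorm(100, meanlog = 2, sdlog = 0.3),
  check.names = FALSE  # keep the lipid names unchanged
)

# One density panel is drawn for every continuous column
plot_density(toy)
```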
Basic plot customization was performed using the theme_config argument. We supplied a list of parameters to change, e.g., we removed the classic ggplot2 gray background by setting panel.background to element_blank():
Then, we changed the color of the x and y axes to black through axis.line.x and axis.line.y:
Finally, we changed the color of the density curves to 'royalblue' or 'red2', and the linewidth to 1, using geom_density_args and a list specifying the color and linewidth:
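The three customization steps described above could be combined in a single call, roughly as follows. This is a sketch under assumptions: toy is simulated, hypothetical data, and the linewidth argument requires a ggplot2 version that supports it (3.4.0 or later).

```r
library(DataExplorer)
library(ggplot2)  # provides element_blank() and element_line()

# Hypothetical stand-in for the lipidomics table (simulated values)
set.seed(1)
toy <- data.frame(
  `SM 39:1;O2` = rlnorm(100, meanlog = 1, sdlog = 0.3),
  `SM 41:1;O2` = rlnorm(100, meanlog = 2, sdlog = 0.3),
  check.names = FALSE
)

# Theme settings passed on to ggplot2::theme()
my_theme <- list(
  panel.background = element_blank(),           # remove the gray background
  axis.line.x = element_line(color = "black"),  # black x axis
  axis.line.y = element_line(color = "black")   # black y axis
)

# Settings passed on to geom_density()
my_density <- list(color = "royalblue", linewidth = 1)

plot_density(toy,
             geom_density_args = my_density,
             theme_config = my_theme)
```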
We obtain the following plots:
Density plots obtained through plot_density() from the DataExplorer.
Such simple plots can be used for a quick data inspection.
Preparing density plots via ggpubr (level: intermediate)
The ggpubr package contains the ggdensity() function. We will show you how to prepare a density plot for a single lipid and for multiple lipids:
The plots:
Density plots prepared using the ggdensity() function from the ggpubr package.
Preparing density plots via ggplot2 (level: advanced)
As you know from the previous example with histograms, we can add a layer with a density plot using geom_density(). Take a look at the examples below - for a single lipid and multiple lipids. To customize the plot a bit more, we added mean values to them:
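One possible way to build such a plot for a single lipid is sketched below. The toy data frame is simulated and hypothetical; the group labels 'N' and 'T' and the colors follow the conventions used elsewhere in this chapter, while the dashed vertical lines are just one way of displaying the group means.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the lipidomics table (simulated values)
set.seed(1)
toy <- data.frame(
  Label = rep(c("N", "T"), each = 50),
  `SM 41:1;O2` = c(rlnorm(50, 2.0, 0.3), rlnorm(50, 2.3, 0.3)),
  check.names = FALSE
)

# Mean concentration in each group
means <- toy %>%
  group_by(Label) %>%
  summarise(Mean = mean(`SM 41:1;O2`))

# Density curves with a dashed vertical line at each group mean
p <- ggplot(toy, aes(x = `SM 41:1;O2`, color = Label, fill = Label)) +
  geom_density(alpha = 0.3, linewidth = 1) +
  geom_vline(data = means,
             aes(xintercept = Mean, color = Label),
             linetype = "dashed") +
  scale_color_manual(values = c(N = "royalblue", T = "red2")) +
  scale_fill_manual(values = c(N = "royalblue", T = "red2")) +
  theme_classic()

p
```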
The output:
Density plots obtained using the ggplot2 library.
It is possible to plot one density curve above the x axis and the other below it. Such a chart is known as a mirror density chart. This plot was inspired by an excellent source of ideas for beautiful R charts, the R graph gallery:
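A mirror density chart can be sketched by drawing two geom_density() layers and flipping the sign of the computed density in one of them. The example below is a minimal illustration on simulated, hypothetical data (toy), not the exact code behind the figure.

```r
library(ggplot2)

# Hypothetical stand-in for the lipidomics table (simulated values)
set.seed(1)
toy <- data.frame(
  Label = rep(c("N", "T"), each = 50),
  `SM 41:1;O2` = c(rlnorm(50, 2.0, 0.3), rlnorm(50, 2.3, 0.3)),
  check.names = FALSE
)

controls <- subset(toy, Label == "N")
patients <- subset(toy, Label == "T")

p <- ggplot() +
  # Controls: density drawn above the x axis
  geom_density(data = controls,
               aes(x = `SM 41:1;O2`, y = after_stat(density)),
               fill = "royalblue", alpha = 0.5) +
  # Patients: the same statistic with a flipped sign, drawn below the x axis
  geom_density(data = patients,
               aes(x = `SM 41:1;O2`, y = after_stat(-density)),
               fill = "red2", alpha = 0.5) +
  labs(y = "Density") +
  theme_classic()

p
```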
We can modify the code to obtain mirror density plots for multiple lipids at once:
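For several lipids, one option is to reshape the table to the long format and add facets, as in the ggpubr example. The sketch below assumes simulated, hypothetical data (toy) and reuses the column names Lipids and Concentrations from this chapter.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical stand-in for the lipidomics table (simulated values)
set.seed(1)
toy <- data.frame(
  Label = rep(c("N", "T"), each = 50),
  `SM 39:1;O2` = c(rlnorm(50, 1.0, 0.3), rlnorm(50, 1.2, 0.3)),
  `SM 41:1;O2` = c(rlnorm(50, 2.0, 0.3), rlnorm(50, 2.3, 0.3)),
  check.names = FALSE
)

# Long format: one row per sample and lipid
long <- toy %>%
  pivot_longer(cols = -Label,
               names_to = "Lipids",
               values_to = "Concentrations")

p <- ggplot(mapping = aes(x = Concentrations)) +
  # Controls above the x axis
  geom_density(data = filter(long, Label == "N"),
               aes(y = after_stat(density)),
               fill = "royalblue", alpha = 0.5) +
  # Patients below the x axis
  geom_density(data = filter(long, Label == "T"),
               aes(y = after_stat(-density)),
               fill = "red2", alpha = 0.5) +
  # One panel per lipid, each with its own x range
  facet_grid(. ~ Lipids, scales = "free_x") +
  labs(y = "Density") +
  theme_classic()

p
```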
We obtain the following plot:
Multiple mirror density plots for multiple SM species.
We can also add annotations to each plot. If multiple panels are created through facet_grid(), the annotations must be supplied to geom_text() as a data frame with labels and coordinates. Here, we will use example annotations: "Control - N" and "Patient - T":
Finally, we obtain:
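The annotation data frame could look roughly like the one below. This is a sketch on simulated, hypothetical data; the names ann, x, y and vjust are illustrative, and placing labels at infinite coordinates (the panel corners) is just one convenient choice that avoids hard-coding positions for each lipid.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical stand-in for the lipidomics table (simulated values)
set.seed(1)
toy <- data.frame(
  Label = rep(c("N", "T"), each = 50),
  `SM 39:1;O2` = c(rlnorm(50, 1.0, 0.3), rlnorm(50, 1.2, 0.3)),
  `SM 41:1;O2` = c(rlnorm(50, 2.0, 0.3), rlnorm(50, 2.3, 0.3)),
  check.names = FALSE
)

long <- toy %>%
  pivot_longer(cols = -Label, names_to = "Lipids", values_to = "Concentrations")

lipids <- unique(long$Lipids)

# One row per panel and label; the Lipids column must match the facet variable
ann <- data.frame(
  Lipids = rep(lipids, each = 2),
  label  = rep(c("Control - N", "Patient - T"), times = length(lipids)),
  x      = Inf,                                    # right edge of each panel
  y      = rep(c(Inf, -Inf), times = length(lipids)),  # top for N, bottom for T
  vjust  = rep(c(1.5, -0.5), times = length(lipids))   # nudge labels into the panel
)

p <- ggplot(mapping = aes(x = Concentrations)) +
  geom_density(data = filter(long, Label == "N"),
               aes(y = after_stat(density)),
               fill = "royalblue", alpha = 0.5) +
  geom_density(data = filter(long, Label == "T"),
               aes(y = after_stat(-density)),
               fill = "red2", alpha = 0.5) +
  geom_text(data = ann,
            aes(x = x, y = y, label = label, vjust = vjust),
            hjust = 1.1, inherit.aes = FALSE) +
  facet_grid(. ~ Lipids, scales = "free_x") +
  theme_classic()

p
```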
Multiple mirror density plots with annotations.
As annotations, one could add, for example, additional clinical data or the exact mean and median concentrations.
# Changing color of density plot curves ('data' is the data set used throughout this chapter):
library(DataExplorer)
plot_density(data, geom_density_args = list(color = 'red2', linewidth = 1))
# Calling libraries
library(tidyverse) # provides %>%, filter(), select(), pivot_longer() and facet_grid()
library(ggpubr)
# Reading the documentation about the function:
?ggdensity
# PLOT A: Density plots for a single lipid (through wide tibble):
data %>%
  filter(Label != 'PAN') %>%
  select(`Label`, `SM 41:1;O2`) %>%
  ggdensity(x = "SM 41:1;O2", # As x, put the name of the selected column.
            color = "Label",
            fill = "Label",
            rug = TRUE,
            palette = c('royalblue', 'red2'))
# PLOT B: Density plots for multiple lipids (through long tibble):
data %>%
  filter(Label != 'PAN') %>%
  select(`Label`,
         `SM 39:1;O2`,
         `SM 40:1;O2`,
         `SM 41:1;O2`,
         `SM 42:1;O2`) %>%
  pivot_longer(cols = `SM 39:1;O2`:`SM 42:1;O2`,
               names_to = 'Lipids',
               values_to = 'Concentrations') %>%
  ggdensity(x = "Concentrations", # As x, put the column containing concentrations.
            color = "Label",
            fill = "Label",
            rug = TRUE,
            palette = c('royalblue', 'red2')) +
  facet_grid(. ~ Lipids, scales = "free_x")