Volcano plots

Metabolites and lipids univariate statistical analysis in R

Volcano plots are broadly used to present outcomes from hypothesis testing on OMICs data. The interpretation of volcano plots is very straightforward - a volcano plot is nothing else but a subtype of scatter plot presenting statistical significance (p-values) versus a magnitude of change - most frequently fold change. The values in the left and right upper corners contain the lipids or metabolites with the most pronounced differences between the groups. Fold change is usually computed as a ratio of both groups' means. For heavily skewed data, you can also compute the ratio of medians and use a nonparametric test for comparing groups: Mann-Whitney U test for unpaired samples or Wilcoxon rank sum for paired study designs. It is important to understand that the volcano plot allows only presenting outcomes from the two group comparisons (e.g., t-test, Welch t-test, or Mann-Whitney U test).

R offers multiple libraries for producing volcano plots. Here, we will show you a great Bioconductor package called EnhancedVolcano for creating publication-ready volcano plots.

Practical applications of volcano plots in lipidomics and metabolomics (examples)

Volcano plots are one of the most commonly used methods for visualizing differentiating lipids and metabolites between two experimental groups. Explore the following manuscripts that utilize volcano plots:

F. Ottosson et al. Unraveling the metabolomic architecture of autism in a large Danish population-based cohort. DOI: https://doi.org/10.1186/s12916-024-03516-7 - Fig. 1A (Note! This volcano plot presents -log(p-value) vs. odds ratio).
Y. R. J. Jaspers et al. Lipidomic biomarkers in plasma correlate with disease severity in adrenoleukodystrophy. DOI: https://doi.org/10.1038/s43856-024-00605-9 - Fig. 1B, Fig. 2A, Fig. 3A, Fig. 4A, Fig. 5B.
H. Tsugawa et al. A lipidome landscape of aging in mice. DOI: https://doi.org/10.1038/s43587-024-00610-6 - Fig. 5B.
S. Salihovic et al. Identification and validation of a blood- based diagnostic lipidomic signature of pediatric inflammatory bowel disease. DOI: https://doi.org/10.1038/s41467-024-48763-7 - Fig. 2.
N. De Silva Mohotti et al. Lipidomic Analysis Reveals Differences in the Extent of Remyelination in the Brain and Spinal Cord. DOI: https://doi.org/10.1021/acs.jproteome.3c00443 - Fig. 6.
V. de Laat et al. Intrinsic temperature increase drives lipid metabolism towards ferroptosis evasion and chemotherapy resistance in pancreatic cancer. DOI: https://doi.org/10.1038/s41467-024-52978-z - Fig. 1c.

Volcano plots in EnhancedVolcano

Let's begin with the installation of EnhancedVolcano. This produce is slightly different as EnhancedVolcano is a Bioconductor package:

# Installation of EnhancedVolcano (highlight and run all lines):
# Works with R 4.3
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("EnhancedVolcano")

# Browsing package vignette:
browseVignettes("EnhancedVolcano")

# Consulting the documentation:
?EnhancedVolcano()

The EnhancedVolcano() function requires, as a minimum:

a table with test statistics (argument toptable),
a column in toptable with names of lipids/metabolites (argument lab),
a column in toptable containing log2 fold change (argument x),
and a column in toptable with raw or adjusted p-values from a statistical test (argument y).

Let's create the most basic volcano plot. First, we will compute Welch's t-test and then adjust the p-values using the Benjamini-Hochberg approach. Next, we will obtain mean-based fold change. Finally, we will produce a basic volcano plot through EnhancedVolcano(). Using xlab, ylab, and title, we will add x and y-axis labels and plot title and subtitle. Look at the code block below:

# Call libraries
library(tidyverse)
library(rstatix)

# Step 1 - Create 'data.long' tibble:
data.long <- 
  data %>%
  select(-`Sample Name`) %>%
  filter(Label != "PAN") %>%     # We remove patients with PAN for this visualization!
  droplevels() %>%              # We also drop the <fct> PAN.      
  pivot_longer(cols = `CE 16:1`:`SM 42:1;O2`,
               names_to = "Lipids",
               values_to = "Concentrations")

# Step 2 - Compute Welch's t-test and adjust p-values (rstatix):
Welch.test <- 
  data.long %>%
  group_by(Lipids) %>%
  t_test(Concentrations ~ Label) %>%
  adjust_pvalue(method = "BH")
  
# Step 3 - Compute mean concentrations for all lipids within healthy controls (N):
Lipids.N <- 
  data.long %>%
  group_by(Lipids, Label) %>%
  summarise(mean = mean(Concentrations)) %>%
  arrange(Label) %>%
  filter(Label == "N")
  
# Step 4 - Compute mean concentrations for all lipids within patients with PDAC (T):
Lipids.T <- 
  data.long %>%
  group_by(Lipids, Label) %>%
  summarise(mean = mean(Concentrations)) %>%
  arrange(Label) %>%
  filter(Label == "T")
  
# Step 5 - Compute fold-change and store it in the "Welch.test" as `Fold change`:
Welch.test$`Fold change` <- Lipids.T$mean/Lipids.N$mean

# Step 6 - Using mutate(), compute log2 of fold change:
Welch.test <-
  Welch.test %>%
  mutate(log.2.FC = log2(`Fold change`))
  
# Step 7 - Call EnhancedVolcano package:
library(EnhancedVolcano)

# Step 8 - Create a default volcano plot from the EnhancedVolcano:
EnhancedVolcano(Welch.test,
                lab = Welch.test$Lipids,
                x = "log.2.FC", 
                y = "p.adj",
                xlab = "Fold change", 
                ylab = "FDR corrected p-value",
                title = "Alterations in plasma lipid profile in PDAC",
                subtitle = "Measured using SFC/MS")

The plot:

Through the caption argument, you can also specify a plot caption. Using axisLabSize, titleLabSize, subtitleLabSize, captionLabSize - the size of the axis titles, plot title, subtitle, and caption can be changed.

However, our axes present -log10(Welch p-value) and log2(fold change). Using the bquote() function, we can add math notations to our plot in the x-y axes labels. Just copy the bquote() expressions provided by us to xlab and ylab:

# Adding math notations to x-y axes labels:
EnhancedVolcano(Welch.test,
                lab = Welch.test$Lipids,
                x = "log.2.FC", 
                y = "p",
                pCutoffCol = "p.adj",
                xlab = bquote(~log[2]~ "fold change"), # Add log2 to fold change for x-axis label
                ylab = bquote(~-log[10]~ "Welch p-value"), # Add -log10 to Welch p-value for y-axis label 
                title = "Alterations in plasma lipid profile in PDAC",
                subtitle = "Measured using SFC/MS",
                titleLabSize = 16,
                subtitleLabSize = 12,
                captionLabSize = 10,
                axisLabSize = 12)

The modified labels:

More about using bquote() for creating math annotations can be found in the documentation:

# Checking bquote() documentation:
?bquote()

And on r-bloggers:

The basic rule is that strings are always in quotes ("..."), wrapped with a tilde separator: "Hello"~. Subscripts are in [...]: log[2], and power of is denoted by ^, e.g. 10^4. If you want to put a variable in the label - use .(). More about bquote() in the sources mentioned above.

Let's customize this plot. The EnhancedVolcano() contains numerous arguments allowing for flexible modifications.

First, let's change the cut-off points first. Via pCutoff a p-value cut-off point can be changed, and a horizontal line is drawn. Authors of the package also take into consideration adjusted p-values. If as y unadjusted p-values are delivered, via pCutoffCol adjusted p-values can be supplied. In this case, the volcano plot relies on -log10(unadjusted p-value), but the p-value cut-offs are set using pCutoffCol. For fold change, the cut-off lines are based on FCcutoff, and vertical lines are added at positive and negative cut-offs. The look of the lines can be changed via:

cutoffLineType - we can select: 'blank', 'solid', 'dashed', 'dotted', 'dotdash', 'longdash', 'twodash'.
cutoffLineCol - used to change the color of FC and p-value cut-off lines.
cutoffLineWidth - used to change the width of cut-off lines.

Let's use these arguments:

# Modifying cut-off lines for fold change/p-value:
EnhancedVolcano(Welch.test,
                lab = Welch.test$Lipids,
                x = "log.2.FC", 
                y = "p",                     # Unadjusted p-value as y.
                pCutoffCol = "p.adj",      # pCutoffCol for selecting p-value cut-offs.
                xlab = bquote(~log[2]~ "fold change"), 
                ylab = bquote(~-log[10]~ "Welch p-value"),
                title = "Alterations in plasma lipid profile in PDAC",
                subtitle = "Measured using SFC/MS",
                pCutoff = 0.05,           # Cut-off for adjusted p-values.
                FCcutoff = 0.3,           # Fold change cut-off.
                cutoffLineType = "dotted", # Changing type of cut-off lines.
                cutoffLineCol = "black",    # Color of cut-off lines.
                cutoffLineWidth = 1)      # Width of cut-off lines.

Our volcano after changes:

Changing the cut-off points created a new group in the volcano: variables significant according to p-value and fold change (red points). For this group, EnhancedVolcano() automatically added lipid names as labels. However, these labels overlap, making it difficult to understand which points are actually annotated. We can modify them through:

labSize - changes the size of all labels,
labCol - changes the color of all labels,
and labFace - changes the font face of all labels.

The labels can be drawn in boxes - you can set boxedLabels to TRUE.

To avoid ambiguous annotations, connections can be drawn between points and labels. To do so, set drawConnections to TRUE. You can change the look of connectors using:

widthConnectors - changes the line width of connectors,
min.segment.length - is used to specify the minimum length of connector line segments,
directionConnectors - direction for drawing connectors: can be set to 'both', 'x', 'y',
typeConnectors - used to set if the arrowhead of connectors should be open or filled,
endsConnectors - used to define which end of the connector should have the arrowhead: can be set to 'last', 'first', 'both',
lengthConnectors - changes the arrowhead size (length),
colConnectors - changes the color of connectors and connector line segments.

If you want to remove arrowheads, set arrowheads to FALSE. This way, you will only keep straight lines. To avoid overlaps in labels, you can define max.overlaps (for Inf - presents all labels, for 0 - presents no labels). We select some of these options and use them to improve the readability of our volcano plot:

# Adjusting labels' look:
EnhancedVolcano(Welch.test,
                lab = Welch.test$Lipids,
                x = "log.2.FC", 
                y = "p",
                pCutoffCol = "p.adj",
                xlab = bquote(~log[2]~ "fold change"), 
                ylab = bquote(~-log[10]~ "Welch p-value"),
                title = "Alterations in plasma lipid profile in PDAC",
                subtitle = "Measured using SFC/MS",
                titleLabSize = 16,
                subtitleLabSize = 12,
                captionLabSize = 10,
                axisLabSize = 12,
                pCutoff = 0.05,
                FCcutoff = 0.3,
                cutoffLineType = "dotted",
                cutoffLineCol = "black",
                cutoffLineWidth = 1,
                labSize = 4,
                drawConnectors = T,
                max.overlaps = 10,
                boxedLabels = T,
                widthConnectors = 0.75,
                endsConnectors = 'first',
                typeConnectors = 'closed')

The plot:

NOTE: Find appropriate settings for your volcano plot to present labels. Remember, it is good to label as many variables as possible, but ambiguous and overlapping annotations are not useful for readers of your manuscript. Therefore, first, focus on labeling the most important variables - in the left and right upper corners. Keep the annotations only if it's possible to understand to what point they correspond easily - as in the example above. A plot crowded with ambiguous annotations should be appropriately adjusted and corrected. If it is impossible to do this easily in R or Python, then you can use any software for processing graphics.

Now, let's change the size, colors, and shapes of points corresponding to all measured lipids. For this purpose, we can use:

pointSize argument for changing the size of points; you can specify a number or a vector of sizes,
col - by default, it corresponds to groups proposed in the legend by authors: NS, Log2FC, p-value, and p-value and Log2FC; using col, four new colors can be selected,
colCustom can be used to override the default color scheme through a vector containing new colors; remember that the order of new colors in the vector used to override default settings must match the order of variables in the toptable,
colAlpha - is used to change the opacity of points,
colGradient - if it is active, then the discrete scale of colors is substituted by a continuous scale of colors; define, e.g., c('red2', 'royalblue'), to see the effect,
colGradientBreaks - breakpoints for the colors specified in colGradient,
colGradientLabels - labels for breakpoints,
colGradientLimits - limits for the color scheme proposed in colGradient,
shape - defines shape for the plotted points; for the default setting of four groups - expects a maximum of four different shapes,
shapeCustom - works similarly to colCustom, and it can be used to override the default setting of shapes; remember that the order of new shapes in the vector must match the order of variables in the toptable!

Let's use these arguments to improve the look of our plot. First, we will increase the size of points and the default four colors:

# Changing points size and default colors:
EnhancedVolcano(Welch.test,
                lab = Welch.test$Lipids,
                x = "log.2.FC", 
                y = "p",
                pCutoffCol = "p.adj",
                xlab = bquote(~log[2]~ "fold change"), 
                ylab = bquote(~-log[10]~ "Welch p-value"),
                title = "Alterations in plasma lipid profile in PDAC",
                subtitle = "Measured using SFC/MS",
                titleLabSize = 16,
                subtitleLabSize = 12,
                captionLabSize = 10,
                axisLabSize = 12,
                pCutoff = 0.05,
                FCcutoff = 0.3,
                cutoffLineType = "dotted",
                cutoffLineCol = "black",
                cutoffLineWidth = 1,
                labSize = 4,
                drawConnectors = T,
                max.overlaps = 10,
                boxedLabels = T,
                widthConnectors = 0.75,
                endsConnectors = 'first',
                typeConnectors = 'closed',
                pointSize = 5,
                col = c('grey', 'grey','orange','red2'))

We obtain:

We can change the groups and select our criteria for colors and shapes, e.g., let's begin by coloring red2 significant upregulations in T and royalblue significant downregulations in T. We must create a vector containing colors in the same order as our variables in the toptable. The column names in this vector should be the new groups we want to introduce. Such a vector can be delivered to colCustom to adjust colors according to preferences.

Let's create three groups with three distinct colors: "Significant upregulation" (red2), "Significant downregulation" (royalblue), and "Insignificant results" (gray).

We will use ifelse() function to create the new vector with colors.

First, we need to interpret the outcomes for every variable in the 'Welch.test' tibble, i.e., every p.adj and every fold change, and define if, for a particular variable, we observe significant up- or downregulation or there is no trend at all:

# First, let's create a vector defining significant up-/down-regulations.
# We will use nested ifelse() function for this purpose.
# Check the documentation of this function:
?ifelse()

# It is simple to use:
# ifelse(test, yes operation, no operation)

# We will need two conditions: first, the p.adj < 0.05 and `log.2.FC` > 0.3:
new.criteria <- 
ifelse(Welch.test$p.adj < 0.05 & Welch.test$`log.2.FC` > 0.3, 'Significant upregulation', ....)

# For every variable with p.adj < 0.05 and `log.2.FC` > 0.3 we create label: "Significant upregulation".
# Now, let's begin the second condition (nested ifelse):
Welch.test$new.criteria <-
ifelse(Welch.test$p.adj < 0.05 & Welch.test$`log.2.FC` > 0.3, 'Significant upregulation', 
         ifelse(Welch.test$p.adj < 0.05 & Welch.test$`log.2.FC` < -0.3, 'Significant downregulation', 'Insignificant')
         )

# After comma, we open a new ifelse() for second condition - it is called nested ifelse().
# In the second condition, we look for variables with p.adj < 0.05 and `log.2.FC` < -0.3.
# Such variables are labeled "Significant downregulation".
# If a variable does not match any of these two criteria - we label it "Insignificant".
# We store the vector we create as 'new.criteria' in the global environment.

Half-way there! Now, as we know, if a variable is significantly up- or downregulated and we have this result stored as a vector, we can set colors for every entry (here: red2 /royalblue). All insignificant outcomes will be colored gray. We now create the vector with color names, which will be delivered to the EnhancedVolcano() function. We will again use the nested ifelse():

# Creating vector with colors:
new.colors <- 
  ifelse(Welch.test$new.criteria == 'Significant upregulation', "red2", 
                                ifelse(Welch.test$new.criteria == 'Significant downregulation', 'royalblue', 'grey'))

For every entry in new.criteria vector named "Significant upregulation" - we set the "red2", for "Significant downregulation - "royalblue", for none of these outcomes, we apply "grey". The outcomes from the nested ifelse() are stored as 'new.colors' vector.

Now, we have our vector with colors, which can be delivered to the colCustom. However, in the vector new.colors, above every color, we need to put a column name matching new groups. In our case: above every red2 - "Significant upregulation", above every royalblue - "Significant downregulation", and for all grey - "Insignificant". A simple way to do this can be found below:

# Merging new.creteria vector with new.color vector (creating color - group pairs):
names(new.colors)[new.colors == 'grey'] <- 'Insignificant'
names(new.colors)[new.colors == 'red2'] <- 'Significant upregulation'
names(new.colors)[new.colors == 'royalblue'] <- 'Significant downregulation'

# Here, for every 'grey' entry in new.colors, the column name will be 'Insignificant'.
# For every 'red2' entry, the column name will be 'Significant upregulation'.
# etc.

A glimpse at the vector before delivering it to colCustom:

And the updated plot code:

# Customizing the color of points - we deliver new.colors to colCustom:
EnhancedVolcano(Welch.test,
                lab = Welch.test$Lipids,
                x = "log.2.FC", 
                y = "p",
                pCutoffCol = "p.adj",
                xlab = bquote(~log[2]~ "fold change"), 
                ylab = bquote(~-log[10]~ "Welch p-value"),
                title = "Alterations in plasma lipid profile in PDAC",
                subtitle = "Measured using SFC/MS",
                titleLabSize = 16,
                subtitleLabSize = 12,
                captionLabSize = 10,
                axisLabSize = 12,
                pCutoff = 0.05,
                FCcutoff = 0.3,
                cutoffLineType = "dotted",
                cutoffLineCol = "black",
                cutoffLineWidth = 1,
                labSize = 4,
                drawConnectors = T,
                max.overlaps = 10,
                boxedLabels = T,
                widthConnectors = 0.75,
                endsConnectors = 'first',
                typeConnectors = 'closed',
                pointSize = 3,
                colCustom = new.colors)

The updated plot:

We can proceed similarly with shapes, but let's change the shape depending on the lipid class. First, we need to extract every lipid class from the Lipids column. Here, we will use a tailored stringr solution to tackle this issue. The library stringr is a part of the tidyverse collection:

# Calling library:
library(tidyverse)

# Reading documentation:
?word()

# Using word() function to extract lipid classes from the `Lipids` column:
Lipid.class <- word(Welch.test$Lipids, 1)

# This operation could also be performed using regex and gsub:
gsub( " .*$", "", Welch.test$Lipids)

# Regex: space, . - any character, * - any number of times, $ - until the end of string

# And we change the Lipid.class into a tibble (it will simplify the next step):
Lipid.class <- as_tibble(Lipid.class)

Now, for every lipid class in the 'Lipid.class' tibble, we must select a number corresponding to the ggplot2 type of shape. Since this requires numerous nested ifelse() functions, we will rather use a more elegant dplyr option: mutate() with case_when(). We can use it because our 'Lipid.class' is a tibble, and we will add a new column there called 'new.shapes':

# Adding `new.shapes` column to the 'Lipid.class' tibble:
Lipid.class <-
  Lipid.class %>%
  mutate(., new.shapes = with(., case_when(
    (value == "CE") ~ 0,
    (value == "Cer") ~ 18,
    (value == "DG") ~ 1,
    (value == "LPC") ~ 2,
    (value == "PC") ~ 17,
    (value == "SM") ~ 15,
    (value == "MG") ~ 16,
    (value == "TG") ~ 3)))

This column can be extracted to create a vector with new shapes for shapeCustom. We need to transpose it and specify that it is a vector:

# Creating 'new.shapes' vector from the 'Lipid.class' tibble:
new.shapes <- as.vector(t(Lipid.class$new.shapes))

Now, we need to match our shapes (numbers) with lipid class names (groups):

# Matching shape numbers with lipid classes:
names(new.shapes)[new.shapes == 0] <- 'CE'
names(new.shapes)[new.shapes == 18] <- 'Cer'
names(new.shapes)[new.shapes == 1] <- 'DG'
names(new.shapes)[new.shapes == 2] <- 'LPC'
names(new.shapes)[new.shapes == 17] <- 'PC'
names(new.shapes)[new.shapes == 15] <- 'SM'
names(new.shapes)[new.shapes == 16] <- 'TG'
names(new.shapes)[new.shapes == 3] <- 'MG'

We are ready to modify our volcano plot:

# Modifying the shape of points depending on the lipid class:
EnhancedVolcano(Welch.test,
                lab = Welch.test$Lipids,
                x = "log.2.FC", 
                y = "p",
                pCutoffCol = "p.adj",
                xlab = bquote(~log[2]~ "fold change"), 
                ylab = bquote(~-log[10]~ "Welch p-value"),
                title = "Alterations in plasma lipid profile in PDAC",
                subtitle = "Measured using SFC/MS",
                titleLabSize = 16,
                subtitleLabSize = 12,
                captionLabSize = 10,
                axisLabSize = 12,
                pCutoff = 0.05,
                FCcutoff = 0.3,
                cutoffLineType = "dotted",
                cutoffLineCol = "black",
                cutoffLineWidth = 1,
                labSize = 4,
                drawConnectors = T,
                max.overlaps = 10,
                boxedLabels = T,
                widthConnectors = 0.75,
                endsConnectors = 'first',
                typeConnectors = 'closed',
                pointSize = 3,
                colCustom = new.colors,
                shapeCustom = new.shapes,
                legendPosition = "right")    # We move the legend to the right side.

The plot:

Specify colors or color codes in the mutate() function with case_when() if you want to color by lipid classes instead of changing the points' shape. Create a vector with new.colors by selecting the appropriate column from 'Lipid.class', transpose it, and match colors with lipid class names:

# Color points by lipid classes.
# Call tidyverse
library(tidyverse)

# Using word() function to extract lipid classes from the `Lipids` column:
Lipid.class <- word(Welch.test$Lipids, 1)

# Change the Lipid.class into a tibble (it will simplify the next step):
Lipid.class <- as_tibble(Lipid.class)

# Adding `new.colors` column to the 'Lipid.class' tibble:
Lipid.class <-
  Lipid.class %>%
  mutate(., new.colors = with(., case_when(
    (value == "CE") ~ "blue",
    (value == "Cer") ~ "red2",
    (value == "DG") ~ "green",
    (value == "LPC") ~ "darkgreen",
    (value == "PC") ~ "violet",
    (value == "SM") ~ "orange",
    (value == "MG") ~ "pink",
    (value == "TG") ~ "darkblue")))

# Creating 'new.colors' vector from the 'Lipid.class' tibble:
new.colors <- as.vector(t(Lipid.class$new.colors))

# Matching colors with lipid classes:
names(new.colors)[new.colors == "blue"] <- 'CE'
names(new.colors)[new.colors == "red2"] <- 'Cer'
names(new.colors)[new.colors == "green"] <- 'DG'
names(new.colors)[new.colors == "darkgreen"] <- 'LPC'
names(new.colors)[new.colors == "violet"] <- 'PC'
names(new.colors)[new.colors == "orange"] <- 'SM'
names(new.colors)[new.colors == "darkblue"] <- 'TG'
names(new.colors)[new.colors == "pink"] <- 'MG'

# Changing the color of points in the volcano plot - according to lipid class:
EnhancedVolcano(Welch.test,
                lab = Welch.test$Lipids,
                x = "log.2.FC", 
                y = "p",
                pCutoffCol = "p.adj",
                xlab = bquote(~log[2]~ "fold change"), 
                ylab = bquote(~-log[10]~ "Welch p-value"),
                title = "Alterations in plasma lipid profile in PDAC",
                subtitle = "Measured using SFC/MS",
                titleLabSize = 16,
                subtitleLabSize = 12,
                captionLabSize = 10,
                axisLabSize = 12,
                pCutoff = 0.05,
                FCcutoff = 0.3,
                cutoffLineType = "dotted",
                cutoffLineCol = "black",
                cutoffLineWidth = 1,
                labSize = 4,
                drawConnectors = T,
                max.overlaps = 10,
                boxedLabels = T,
                widthConnectors = 0.75,
                endsConnectors = 'first',
                typeConnectors = 'closed',
                pointSize = 3,
                colCustom = new.colors,
                legendPosition = "right")

The new plot:

Finally, we can modify the plot background, axes, and legend:

hline - you can add horizontal lines passing through y-axis,
hlineType - change horizontal line type,
hlineCol - change horizontal line color,
hlineWidth - change horizontal line width,
vline - add vertical lines passing through x-axis,
vlineType - change vertical line type,
vlineCol - change vertical line color,
vlineWidth - change vertical line width,
gridlines.major - logical - whether draw major gridlines,
gridlines.minor - logical - whether draw minor gridlines,
border - add border for x and y axes ('partial') or around the entire chart ('full'),
borderWidth - border width,
borderColour - border color,
xlim - x-axis limits,
ylim - y-axis limits,
legendLabels - plot legend text labels,
legendPosition - used to set the position of the legend ('bottom', 'top', 'left', 'right'),
legendLabSize - size of plot legend text,
legendIconSize - size of legend symbols.

# Changing plot background, range of axes, and legend style:
EnhancedVolcano(Welch.test,
                lab = Welch.test$Lipids,
                x = "log.2.FC", 
                y = "p",
                pCutoffCol = "p.adj",
                xlab = bquote(~log[2]~ "fold change"), 
                ylab = bquote(~-log[10]~ "Welch p-value"),
                title = "Alterations in plasma lipid profile in PDAC",
                subtitle = "Measured using SFC/MS",
                caption = "In total = 127 lipids",
                titleLabSize = 16,
                subtitleLabSize = 12,
                captionLabSize = 10,
                axisLabSize = 12,
                pCutoff = 0.05,
                FCcutoff = 0.3,
                cutoffLineType = "dashed",
                cutoffLineCol = "black",
                cutoffLineWidth = 1,
                labSize = 4,
                drawConnectors = T,
                max.overlaps = 10,
                boxedLabels = T,
                widthConnectors = 0.75,
                endsConnectors = 'first',
                typeConnectors = 'closed',
                pointSize = 3,
                colCustom = new.colors,
                hline = c(0.05,10e-5,10e-10,10e-15,10e-20),
                hlineType = c('dashed','dotted','dotted','dotted','dotted'),
                hlineCol = c('black','gray','gray','gray','gray'),
                vline = c(-1,1),
                vlineType = "dotted",
                vlineCol = "red2",
                xlim = c(-1.7,1.7),
                ylim = c(0, 22),
                gridlines.major = FALSE,
                gridlines.minor = FALSE,
                border = "full",
                legendPosition = "right",
                legendLabSize = 12,
                legendIconSize = 4)

The plot:

Or without any h- and vlines in the background:

# Changing plot background, range of axes, and legend style - proposal 2:
EnhancedVolcano(Welch.test,
                lab = Welch.test$Lipids,
                x = "log.2.FC", 
                y = "p",
                pCutoffCol = "p.adj",
                xlab = bquote(~log[2]~ "fold change"), 
                ylab = bquote(~-log[10]~ "Welch p-value"),
                title = "Alterations in plasma lipid profile in PDAC",
                subtitle = "Measured using SFC/MS",
                caption = "In total = 127 lipids",
                titleLabSize = 16,
                subtitleLabSize = 12,
                captionLabSize = 10,
                axisLabSize = 12,
                pCutoff = 0.05,
                FCcutoff = 0.3,
                cutoffLineType = "dashed",
                cutoffLineCol = "black",
                cutoffLineWidth = 1,
                labSize = 4,
                drawConnectors = T,
                max.overlaps = 10,
                boxedLabels = T,
                widthConnectors = 0.75,
                endsConnectors = 'first',
                typeConnectors = 'closed',
                pointSize = 3,
                colCustom = new.colors,
                gridlines.major = FALSE,
                gridlines.minor = FALSE,
                border = "full",
                legendPosition = "right",
                legendLabSize = 12,
                legendIconSize = 4,
                xlim = c(-2,2))

The plot:

More volcano plot designs and details about EnhancedVolcano() function can be found in the meticulous vignette prepared by the authors. We encourage you to read it:

Creating interactive volcano plots through plotly (advanced)

Using plotly, interactive volcano plots can be generated to facilitate, e.g., data exploratory analysis. In the first step, we again compute the Welch t-test and mean-based fold changes:

# Computing Welch t-test and fold change.
# Calling libraries:
library(rstatix)
library(ggpubr)
library(tidyverse)

# Step 1 - Create 'data.long' tibble:
data.long <- 
  data %>%
  select(-`Sample Name`) %>%
  filter(Label != "PAN") %>%
  pivot_longer(cols = `CE 16:1`:`SM 42:1;O2`,
               names_to = "Lipids",
               values_to = "Concentrations")

# Step 2 - Compute Welch's t-test and adjust p-values (rstatix):
Welch.test <- 
  data.long %>%
  group_by(Lipids) %>%
  t_test(Concentrations ~ Label) %>%
  adjust_pvalue(method = "BH")

# Step 3 - Compute mean concentrations for all lipids within healthy controls (N):
Lipids.N <- 
  data.long %>%
  group_by(Lipids, Label) %>%
  summarise(mean = mean(Concentrations)) %>%
  arrange(Label) %>%
  filter(Label == "N")

# Step 4 - Compute mean concentrations for all lipids within patients with PDAC (T):
Lipids.T <- 
  data.long %>%
  group_by(Lipids, Label) %>%
  summarise(mean = mean(Concentrations)) %>%
  arrange(Label) %>%
  filter(Label == "T")

# Step 5 - Compute fold-change and store it in the "Welch.test" as `Fold change`:
Welch.test$`Fold change` <- Lipids.T$mean/Lipids.N$mean

# Step 6 - Using mutate(), compute log2 of fold change:
Welch.test <-
  Welch.test %>%
  mutate(log.2.FC = log2(`Fold change`))

Now, before we use plotly to create a volcano plot - we will define four groups:

variables with log2(fold change) > 0.3 - Fold change,
variables with log2(fold change) > 0.3 and p.adj < 0.05 - Significant & Fold Change,
variables with log2(fold change) < 0.3 and p.adj < 0.05 - Significant,
and finally - variables with log2(fold change) < 0.03 and p.adj > 0.05 - Not significant.

# First, we add 'group' column to 'Welch.test' tibble:
Welch.test["group"] <- "Not Significant"

# To avoid testing for one condition - we will this column with "Not significant".
# Accessing p.adj and log.2.FC columns row-by-row, we test the condition and place the solution in the 'group' column:
Welch.test[which(Welch.test['p.adj'] < 0.05 & abs(Welch.test['log.2.FC']) < 0.3 ),"group"] <- "Significant"

# First, for all p.adj < 0.05 and log2(FC) < 0.3.
# Then, for all p.adj > 0.05 and log2(FC) > 0.3:
Welch.test[which(Welch.test['p.adj'] >= 0.05 & abs(Welch.test['log.2.FC']) > 0.3 ),"group"] <- "Fold Change"

# Finally, for p.adj < 0.05 & log2(FC) > 0.3:
Welch.test[which(Welch.test['p.adj'] < 0.05 & abs(Welch.test['log.2.FC']) > 0.3),"group"] <- "Significant & Fold Change"
 
# To avoid checking the conditions separately for negative and positive fold change, we use abs() function.
# It returns absolute value.

# Finally, we build a vector with colors for each group:
colors <- c('black', 'grey', 'royalblue', 'red2')

# We set the column name for every color corresponding to one of four groups:
colors <- setNames(colors, c("Fold Change", "Not Significant", "Significant", "Significant & Fold Change"))

# This is exactly the same type of vector as we built above for EnhancedVolcano()!

# Now, the plot code:
plot_ly(x = Welch.test$log.2.FC, 
        y = -log10(Welch.test$p.adj), 
        text = Welch.test$Lipids, 
        mode = "markers", 
        color = Welch.test$group, 
        colors = colors) %>%   
  layout(title = "Alterations in plasma lipidome in PDAC")
  
# We specify x - log.2.FC, y - as -log10 of p.adj,
# As labels (text) - we use lipid names,
# We want a simple scatter plot, and plot points = "markers",
# We color according to created groups color = Welch.test$group,
# Use colors from the vector colors,
# Finally, add a title through the layout.

In effect, we obtain this beautiful interactive volcano chart:

PreviousResults of tests as annotations in the charts NextLipid maps and acyl-chain plots

Last updated 3 months ago