Correlation heat maps
Metabolites and lipids descriptive statistical analysis in R
Correlation heat maps are plots that display the pairwise correlations between variables in a data set as a color-coded matrix. They are frequently used in manuscripts to depict relationships between lipid or metabolite concentrations, as well as their associations with clinical variables. These relationships can be further explored in the context of lipid (metabolite) metabolism, especially when they change in response to disease progression, recovery, etc.
For practical examples of correlation heat maps, refer to the following papers:
A. Jeucken et al. A Comprehensive Functional Characterization of Escherichia coli Lipid Genes. DOI: - Fig. 5 (a study in Cell Reports using lipid-lipid correlation networks, i.e., exploring statistical relationships between the 100 most abundant lipid species and analyzing these relationships in the context of lipid metabolism in the bacterium).
K. Huynh et al. Concordant peripheral lipidome signatures in two large clinical studies of Alzheimer’s disease. DOI: - Fig. 1C (a study on peripheral lipidome signatures of Alzheimer's disease published in Nature Communications, presenting statistical relationships (Spearman correlation) between total lipid classes, subclasses, and commonly reported clinical measures).
B. Peng et al. Identification of key lipids critical for platelet activation by comprehensive analysis of the platelet lipidome. DOI: - Fig. 4D (absolute quantification of the platelet lipidome published in Blood; the authors used a correlation heat map with hierarchical clustering for 384 quantified lipid species; 12 distinct clusters of correlated and anticorrelated lipids were identified during platelet activation (Pearson correlation ≥0.85)).
Y. Ding et al. Comprehensive metabolomics profiling reveals common metabolic alterations underlying the four major non-communicable diseases in treated HIV infection. DOI: - Fig. 5 (in this study published in eBioMedicine, a part of the Lancet Discovery Science journals, the authors used a correlation heat map to present statistical relationships (Spearman correlation) between eigenmetabolite, altered metabolites, classical lipids, and clinical parameters in all participants).
L. Ottensmann et al. Genome-wide association analysis of plasma lipidome identifies 495 genetic associations. DOI: - Fig. 1b (the authors of a manuscript published in Nature Communications used a correlation heatmap for presenting the absolute pairwise Pearson correlations between the lipid species included in the 11 clusters of the multivariate genome-wide association studies).
M. Lange et al. AdipoAtlas: A reference lipidome for human white adipose tissue. DOI: - Fig. 4C (the authors use correlation heat maps for presenting Pearson's correlations of significantly regulated lipids between lean and obese WAT).
The correlation heat map can be obtained with the plot_correlation() function from the DataExplorer package:
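A minimal sketch of such a call is shown here. The data frame is simulated purely for illustration (the column names and values are hypothetical, not the data set used in this tutorial), and Spearman correlation is chosen only as an example:

```r
library(DataExplorer)

# Hypothetical data: 40 samples, 6 simulated log-transformed lipid concentrations.
# A shared component makes some columns correlate with each other.
set.seed(123)
shared <- rnorm(40)
data <- data.frame(
  PC_34_1  = shared + rnorm(40, sd = 0.5),
  PC_36_2  = shared + rnorm(40, sd = 0.5),
  SM_34_1  = shared + rnorm(40, sd = 1.0),
  Cer_42_1 = -shared + rnorm(40, sd = 0.5),
  TG_52_2  = rnorm(40),
  LPC_18_0 = rnorm(40)
)

# Correlation heat map of all continuous columns.
# cor_args is passed on to stats::cor(); here it selects Spearman correlation.
corr_plot <- plot_correlation(
  data,
  type = "continuous",
  cor_args = list(method = "spearman")
)
```

With your own data, replace the simulated data frame with a table of numeric lipid or metabolite concentrations (samples in rows, variables in columns).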
The heat maps:
The ggcorrplot package produces a ggplot2 visualization of the correlation matrix. Read more about the package here:
First, we compute the correlation matrix; then, we visualize it:
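The two steps could look as follows. This is only a sketch on simulated data (hypothetical column names and values), and the two plot variants are examples of commonly used ggcorrplot() options rather than a reproduction of specific figures:

```r
library(ggcorrplot)

# Hypothetical data: 40 samples, 6 simulated log-transformed lipid concentrations.
set.seed(123)
shared <- rnorm(40)
data <- data.frame(
  PC_34_1  = shared + rnorm(40, sd = 0.5),
  PC_36_2  = shared + rnorm(40, sd = 0.5),
  SM_34_1  = shared + rnorm(40, sd = 1.0),
  Cer_42_1 = -shared + rnorm(40, sd = 0.5),
  TG_52_2  = rnorm(40),
  LPC_18_0 = rnorm(40)
)

# Step 1: compute the correlation matrix (Spearman chosen as an example).
corr_matrix <- cor(data, method = "spearman", use = "pairwise.complete.obs")

# Step 2a: full heat map with default settings.
plot_full <- ggcorrplot(corr_matrix)
print(plot_full)

# Step 2b: lower triangle only, variables reordered by hierarchical
# clustering, correlation coefficients printed in the tiles.
plot_clustered <- ggcorrplot(
  corr_matrix,
  type = "lower",
  hc.order = TRUE,
  lab = TRUE,
  colors = c("blue", "white", "red")
)
print(plot_clustered)
```

Because ggcorrplot() returns a ggplot object, the result can be modified further with the usual ggplot2 layers and saved with ggsave().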
Plot A:
Plot B:
With the corrplot library, the principle is the same as for ggcorrplot: first, compute the correlation matrix; then, visualize it. Read more about the possibilities offered by the corrplot library here:
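A minimal sketch of this workflow, assuming simulated data (hypothetical column names and values) and an arbitrary blue-white-red palette built with colorRampPalette:

```r
library(corrplot)

# Hypothetical data: 40 samples, 6 simulated log-transformed lipid concentrations.
set.seed(123)
shared <- rnorm(40)
data <- data.frame(
  PC_34_1  = shared + rnorm(40, sd = 0.5),
  PC_36_2  = shared + rnorm(40, sd = 0.5),
  SM_34_1  = shared + rnorm(40, sd = 1.0),
  Cer_42_1 = -shared + rnorm(40, sd = 0.5),
  TG_52_2  = rnorm(40),
  LPC_18_0 = rnorm(40)
)

# Step 1: compute the correlation matrix (Pearson chosen as an example).
corr_matrix <- cor(data, method = "pearson")

# Custom palette: 200 colors interpolated from blue (-1) through white (0) to red (1).
heat_colors <- colorRampPalette(c("blue", "white", "red"))(200)

# Step 2: draw the heat map with filled tiles, ordering the variables
# by hierarchical clustering and using black text labels.
corrplot(
  corr_matrix,
  method = "color",
  col = heat_colors,
  order = "hclust",
  tl.col = "black",
  tl.cex = 0.8
)
```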
As you see, we immediately customized the color scale (-1 to 1). By changing the color codes in colorRampPalette, you can create your own color palette for the heat map. The correlation heat map obtained from the code above looks like this:
IMPORTANT: The correlation heat map produced by corrplot is surrounded by a white background. The simplest way to remove it is to crop the image in any free image-editing software.