The 'for' loop in R (advanced)
Useful tricks and features in OMICs mining
Occasionally, in this GitBook, we will need to iterate over the components of a data frame, vector, matrix, or list because not all functions or libraries we present are compatible with the tidyverse solutions.
Note!
If you are a beginner, you can skip this chapter and return to it once we use loops in the GitBook (e.g., see Metabolites and Lipids Univariate Statistics in R).
Here, we will briefly introduce the simplest loop in R, i.e., the 'for' loop. The for loop is useful when you must repeat specific lines (or blocks) of code for each element in a vector or other object. With the solutions provided by the tidyverse, loops are now less commonly used in R. However, they remain essential in many other programming languages.
Let's analyze how the for loop works. Later, we will use the loop to compute the mean, standard deviation, median, and interquartile range for all columns with lipid concentrations of our PDAC data frame.
The for-loop construction is the following:
# The 'for' loop construction in R:
# Note: this code is an abstraction that will not actually run
for (i in vec) { # 'i' takes each value stored in 'vec', one at a time.
  # Loop body: block of commands (statements)
}
# The 'i' represents values in a 'vec' vector: from the first to the last value of 'vec'.
# The loop takes every 'i' from 'vec' and evaluates the block of commands for it.
In a simplified example, we want to use the for loop to convert the concentration of CE 18:2 from pmol/ml of plasma to nmol/ml of plasma:
# Recalculating plasma concentration of CE 18:2 from pmol/ml to nmol/ml
# First, we create a vector with CE 18:2 concentrations in our patients' plasma samples:
concentrations <- c(2100000, 1590231, 1891203, 1999142, 1567343)
# Next, we create the loop:
for (i in concentrations) {
value <- i / 1000 # Convert every i to nmol/ml and store it as 'value'.
print(paste(value, "nmol/ml")) # Print the outcomes in the R console with a new unit.
}
# Comments:
# For every entry in the 'concentrations' vector, marked as i,
# recalculate the unit from pmol/ml to nmol/ml (divide by 1000),
# and store as 'value'.
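# Print the stored 'value' with a new unit, "nmol/ml", in the R console.
## Here, we used the paste() function to connect every 'value' with a new unit 'nmol/ml'.
The output in the R console:
[1] "2100 nmol/ml"
[1] "1590.231 nmol/ml"
[1] "1891.203 nmol/ml"
[1] "1999.142 nmol/ml"
[1] "1567.343 nmol/ml"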

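Printing is fine for a quick look, but usually we want to keep the converted values for later steps. The sketch below is one way to do that: it loops over the positions of the vector with seq_along() and writes each result into a pre-allocated vector. The names `converted` and `converted_vectorized` are our own choices for illustration.

```r
# CE 18:2 concentrations in pmol/ml (same values as in the example above)
concentrations <- c(2100000, 1590231, 1891203, 1999142, 1567343)

# Pre-allocate a numeric vector of the right length to hold the results
converted <- numeric(length(concentrations))

# Loop over the positions 1, 2, ..., 5 rather than over the values themselves
for (i in seq_along(concentrations)) {
  converted[i] <- concentrations[i] / 1000  # pmol/ml -> nmol/ml
}

# Arithmetic in R is vectorized, so the same result needs no loop at all
converted_vectorized <- concentrations / 1000

print(converted)
```

For a simple unit conversion like this one, the vectorized line is the more idiomatic choice; the loop is shown only to practice the construction.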
Let's try a more complicated for loop. Imagine that the tidyverse tools have not yet been developed, and you plan to use the loop to compute the mean concentration across all samples, as well as the standard deviation, median, and interquartile range for the PDAC data set. We will rely on base R functions, including mean(), sd(), median(), and IQR().
Note!
We will use the PDAC data set from the Introduction: Example data sets:
Here is the code with explanations:
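A minimal sketch of such a loop is given below. It assumes a small toy data frame, `toy_data`, with made-up values and hypothetical column names standing in for the PDAC data set. Plain data frames are used instead of tibbles so that the sketch runs in base R without extra packages; `results_list` and `lipid_columns` are names chosen for illustration.

```r
# Toy stand-in for the PDAC data (made-up values, hypothetical column names)
toy_data <- data.frame(
  sample_id = c("S1", "S2", "S3", "S4"),
  `CE 16:0` = c(10, 12, 14, 16),
  `CE 18:2` = c(100, 110, 130, 160),
  check.names = FALSE  # keep the lipid names exactly as written
)

# Names of the columns with concentrations (everything except the sample ID)
lipid_columns <- setdiff(names(toy_data), "sample_id")

# Empty list that will hold one small summary table per lipid
results_list <- list()

for (lipid in lipid_columns) {
  values <- toy_data[[lipid]]  # pull the column out as a numeric vector

  # Store the four statistics as a one-row table named after the lipid
  results_list[[lipid]] <- data.frame(
    lipid  = lipid,
    mean   = mean(values),
    sd     = sd(values),
    median = median(values),
    iqr    = IQR(values)
  )
}

# Inspect the summary computed for one of the lipids
print(results_list[["CE 18:2"]])
```

Each pass through the loop handles one column, so the list ends up with as many elements as there are lipid columns.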
Though our computations are finished, the results would not be easy to analyze. For this reason, we will rearrange the list of tibbles into one tibble. We can do it in one line of code:
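One possible way to do this in base R is to stack the list elements with rbind() through do.call(). The sketch assumes a list of one-row data frames named `results_list`, filled here by hand with illustrative numbers. For a list of tibbles, dplyr::bind_rows() plays the same role.

```r
# Two one-row summary tables, standing in for the output of a loop
# (the values are illustrative only)
results_list <- list(
  `CE 16:0` = data.frame(lipid = "CE 16:0", mean = 13, median = 13),
  `CE 18:2` = data.frame(lipid = "CE 18:2", mean = 125, median = 120)
)

# Stack all list elements on top of each other into a single table
results_table <- do.call(rbind, results_list)

# Drop the row names that rbind() inherited from the list names
rownames(results_table) <- NULL

print(results_table)
```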
Outcome:

Note!
The data can be assembled into one tibble directly within the for loop. Instead of creating resultslist and storing every output as an element of the list, we can create resultstibble and add a new row to it after every iteration. Take a look at this slight modification:
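A sketch of this variant, again assuming the made-up `toy_data` data frame and base R data frames in place of tibbles. The table `results_table` starts empty and grows by one row per iteration; all names are chosen for illustration.

```r
# Toy stand-in for the PDAC data (made-up values, hypothetical column names)
toy_data <- data.frame(
  sample_id = c("S1", "S2", "S3", "S4"),
  `CE 16:0` = c(10, 12, 14, 16),
  `CE 18:2` = c(100, 110, 130, 160),
  check.names = FALSE
)
lipid_columns <- setdiff(names(toy_data), "sample_id")

# Start with an empty table instead of an empty list
results_table <- data.frame()

for (lipid in lipid_columns) {
  values <- toy_data[[lipid]]

  # One-row table with the statistics for the current lipid
  new_row <- data.frame(
    lipid  = lipid,
    mean   = mean(values),
    sd     = sd(values),
    median = median(values),
    iqr    = IQR(values)
  )

  # Append the new row to the bottom of the growing table
  results_table <- rbind(results_table, new_row)
}

print(results_table)
```

Growing a table row by row is convenient for a handful of columns. For very wide data sets, collecting the rows in a list and binding them once at the end is usually faster.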
This is just one example of how the tidyverse has simplified scripting in R. As you can see, loops can get complicated (and we haven't even started computing the results for every experimental group separately!).
Look at the following block of code (here, we will need two for loops):
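A sketch of the nested construction, assuming a toy data frame with a hypothetical `group` column and made-up group labels ("control", "patient") and concentrations. The outer loop walks through the groups and the inner loop through the lipid columns.

```r
# Toy data with a grouping column (labels and values are made up)
toy_data <- data.frame(
  group     = c("control", "control", "control", "patient", "patient", "patient"),
  `CE 16:0` = c(10, 12, 14, 20, 22, 27),
  `CE 18:2` = c(100, 110, 120, 90, 80, 70),
  check.names = FALSE
)
lipid_columns <- setdiff(names(toy_data), "group")
groups <- unique(toy_data$group)

results_table <- data.frame()

for (g in groups) {                      # outer loop: one pass per group
  # Keep only the rows that belong to the current group
  group_rows <- toy_data[toy_data$group == g, ]

  for (lipid in lipid_columns) {         # inner loop: one pass per lipid
    values <- group_rows[[lipid]]

    new_row <- data.frame(
      group  = g,
      lipid  = lipid,
      mean   = mean(values),
      sd     = sd(values),
      median = median(values),
      iqr    = IQR(values)
    )
    results_table <- rbind(results_table, new_row)
  }
}

print(results_table)
```

The result has one row for every combination of group and lipid, here 2 groups x 2 lipids = 4 rows.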
Output:

For more information about loops used in R, please refer to the following books: