Application of pipe (%>%) functions
Useful tricks and features in OMICs mining
Before we start working with the content of tibbles (sorting, arranging, slicing, selecting, filtering, mutating, etc.), we need to start using 'pipes'. Pipes are handy functions for organizing data preparation and analysis, including manipulating the content of data frames, transforming and normalizing data, computing, and plotting. Pipes chain computational operations together into a pipeline, in which the output of one function is passed immediately to the next function as its first argument. The result is a chain of actions that processes an R object into the desired output through an elegant, easy-to-read, well-organized code block. Again, note that pipes are themselves functions and process one primary R object at a time. Pipes are supplied by the magrittr package, which is installed together with the dplyr package from the tidyverse collection. Therefore, to use pipes, we will load the tidyverse collection. In code, the magrittr pipe is written as %>%.
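As a minimal sketch of the idea, the block below runs the same computation twice, once as nested function calls and once as a pipeline. The vector `expr` and its values are hypothetical toy data, not taken from any dataset used in this book.

```r
library(tidyverse)  # attaches dplyr and makes the %>% pipe available

# Hypothetical toy data: expression values for four genes
expr <- c(16, 4, 64, 1)

# Nested calls: must be read from the inside out
nested <- round(mean(log2(expr)), 1)

# The same computation as a pipeline: read from top to bottom.
# Each step receives the previous result as its first argument.
piped <- expr %>%
  log2() %>%
  mean() %>%
  round(1)

# Both approaches give the same value
identical(nested, piped)
```

The pipeline version lists the steps in the order they are executed, which is what makes longer chains easier to follow.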
You can find more information about pipes here:
If a function used in the pipeline takes long arguments, it is good practice to split them and put each argument on its own line.
If we want to keep the output of a pipeline as an R object, we can assign it at the beginning of the pipeline, either with the object name and assignment operator on the same line as the first step or on a separate line above it. The tidyverse style guide also allows assigning the object at the end of a pipeline.
However, in this GitBook, we will mostly rely on the first option. Below, you will find examples of pipelines adhering to these rules:
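The following sketch illustrates these formatting rules. The tibble `proteins` and all of its column names and values are hypothetical and serve only as an illustration.

```r
library(tidyverse)

# Hypothetical toy tibble; names and values are illustrative only
proteins <- tibble(
  protein = c("P1", "P2", "P3", "P4"),
  group = c("control", "treated", "control", "treated"),
  intensity = c(1200, 850, 430, 2100)
)

# Very short pipeline: a single line is acceptable
proteins %>% nrow()

# Longer pipeline: a space before each %>%, one step per line,
# and every line after the first indented by two spaces
treated_log <- proteins %>%
  filter(group == "treated") %>%
  mutate(log2_intensity = log2(intensity)) %>%
  arrange(desc(log2_intensity))

# A function with long arguments: each argument on its own line
group_summary <- proteins %>%
  group_by(group) %>%
  summarise(
    mean_intensity = mean(intensity),
    max_intensity = max(intensity),
    n_proteins = n()
  )
```

Both stored objects use the first assignment option, with the object name on the same line as the first step of the pipeline.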
Now, take a look at the following block of code presenting long pipelines:
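As an illustrative sketch only, the two pipelines below reach the same result by slightly different routes. The tibble `samples`, its columns, and its values are hypothetical and are not the data used elsewhere in this book.

```r
library(tidyverse)

# Hypothetical toy tibble; names and values are illustrative only
samples <- tibble(
  gene = c("A", "B", "C", "D", "E", "F"),
  condition = c("ctrl", "ctrl", "ctrl", "trt", "trt", "trt"),
  counts = c(8, 0, 32, 128, 8, 0),
  batch = c(1, 2, 1, 2, 1, 2)
)

# Pipeline 1: drop zero counts first, then transform and summarise
result_1 <- samples %>%
  filter(counts > 0) %>%
  select(gene, condition, counts) %>%
  mutate(log2_counts = log2(counts)) %>%
  group_by(condition) %>%
  summarise(mean_log2 = mean(log2_counts)) %>%
  arrange(desc(mean_log2))

# Pipeline 2: transform first, then drop the non-finite values
# produced by log2(0)
result_2 <- samples %>%
  select(-batch) %>%
  mutate(log2_counts = log2(counts)) %>%
  filter(is.finite(log2_counts)) %>%
  group_by(condition) %>%
  summarise(mean_log2 = mean(log2_counts)) %>%
  arrange(desc(mean_log2))

# Print both results to compare them
result_1
result_2
```

Because every step sits on its own line, a step can be commented out or reordered without rewriting the rest of the pipeline.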
The output of both pipelines looks like this:
Finally, we will show you how you can store the outputs of pipelines. Please see the code block below:
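A minimal sketch of the three assignment forms accepted by the tidyverse style guide is given here. The tibble `expr` and the object names `high_1`, `high_2`, and `high_3` are hypothetical.

```r
library(tidyverse)

# Hypothetical toy tibble; names and values are illustrative only
expr <- tibble(
  gene = c("A", "B", "C"),
  counts = c(10, 250, 40)
)

# Option 1: object name and assignment on the same line as the first step
high_1 <- expr %>%
  filter(counts > 20) %>%
  arrange(desc(counts))

# Option 2: object name and assignment on a separate line
high_2 <-
  expr %>%
  filter(counts > 20) %>%
  arrange(desc(counts))

# Option 3: assignment at the end of the pipeline with ->
expr %>%
  filter(counts > 20) %>%
  arrange(desc(counts)) ->
  high_3
```

All three forms store the same tibble; they differ only in where the object name appears.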
You can find the script containing all of the code blocks here:
Now that you know how the pipe function works, let's actively use it for OMICs analysis in the following chapters!
According to the tidyverse style guide, a pipe symbol (%>%) should always be preceded by a space. Except in very short pipelines, every new step should start on a new line to improve readability. Each line after the first step should be indented by two spaces. This makes it easier to add further lines of code and less likely that a step of the process is overlooked.