Example data sets
Data set for computations and visualizations used in this Gitbook
In this Gitbook, all computations and visualizations are based on an example serum lipidomics data set containing concentrations of 127 lipid species measured in 97 healthy volunteers (N), 21 patients with pancreatitis (PAN), and 109 patients with pancreatic cancer (PDAC, or T). To simplify further steps, we assume that the data frame used for all computations and visualizations has lipids in columns (1 column = all concentrations of 1 lipid) and patients in rows (1 row = all lipid concentrations for 1 patient). A table in this layout, for example an Excel sheet, must be loaded into R/Python for data analysis and visualization. Instructions on how to load your data in R/Python are given in the next subchapters.
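The layout described above can be illustrated with a small toy table. This is only a sketch in Python with pandas: the sample names, the lipid names, the concentration values, and the presence of a "Label" column holding the group (N, PAN, T) are all hypothetical and are not taken from the real data set.

```python
import pandas as pd

# Hypothetical toy table: 3 patients (rows) and 3 lipids (columns).
# All names and values here are made up for illustration only.
toy = pd.DataFrame(
    {
        "Label": ["N", "PAN", "T"],  # assumed group column
        "Lipid A": [1.2, 0.9, 0.7],
        "Lipid B": [10.5, 11.1, 8.3],
        "Lipid C": [250.0, 240.5, 198.2],
    },
    index=["Sample 1", "Sample 2", "Sample 3"],
)
toy.index.name = "Sample"

# 1 row = all lipid concentrations for 1 patient.
row_for_one_patient = toy.loc["Sample 1"]

# 1 column = all concentrations of 1 lipid.
column_for_one_lipid = toy["Lipid B"]

# Columns that hold concentrations (everything except the group label).
lipid_columns = [c for c in toy.columns if c != "Label"]

print(toy)
print(row_for_one_patient)
print(column_for_one_lipid)
```

Keeping patients in rows and lipids in columns means that selecting one column returns every measurement of a single lipid, which is the shape most statistical and plotting functions expect.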
All human serum samples were obtained from the Bank of Biological Material at the Masaryk Memorial Cancer Institute in Brno (PDAC, healthy volunteers, pancreatitis) and from the First and Third Faculty of Medicine at Charles University in Prague (pancreatitis). Sample collection was approved by the institutional ethical committees of all involved institutes, and all blood donors gave signed informed consent for blood collection. Sample selection was based on the availability of stored serum samples. The only exclusion criterion for healthy controls was a lifetime history of malignant disease; no other diseases were grounds for exclusion. For all PDAC patients, the disease was confirmed by abdominal computed tomography and/or endoscopic ultrasound followed by needle biopsy or surgical resection. Twenty-one patients with chronic pancreatitis treated at two outpatient departments were included. The pancreatitis was either ethanol-induced or recurrent acute pancreatitis and was confirmed by imaging methods (endoscopic ultrasound or endoscopic retrograde cholangiopancreatography). All PDAC patients, pancreatitis patients, and healthy controls were of Caucasian ethnicity.
The serum samples were stored at −80 °C until further processing.
The lipid concentrations in the serum of PDAC patients and healthy controls were published in the following manuscript after normalization to the NIST plasma standard (NIST SRM 1950):
First, please download the following data set and have a look at it:
Data set no. 2 was created using Data set no. 1. Missing values were introduced into the PDAC lipidomics data set using the R programming language.
Columns contain the lipid concentrations; each row contains the concentrations for one patient.
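The exact procedure used to introduce missing values into Data set no. 2 is not described here, and it was carried out in R. Purely as an illustration, the following Python sketch shows one possible way to do something similar: it sets a chosen fraction of randomly selected concentration cells to NaN. The function name `introduce_missing`, the toy table, and the choice of completely random removal are assumptions made for this example, not the authors' method.

```python
import numpy as np
import pandas as pd


def introduce_missing(df, fraction, seed=0):
    """Return a copy of df with a fraction of its numeric cells set to NaN.

    Hypothetical helper: cells are picked completely at random, which may
    differ from how Data set no. 2 was actually created.
    """
    rng = np.random.default_rng(seed)
    out = df.copy()
    numeric_cols = list(out.select_dtypes(include="number").columns)

    # Work on a writable float copy of the concentration values.
    values = out[numeric_cols].to_numpy(dtype=float, copy=True)
    n_missing = int(round(fraction * values.size))

    # Pick distinct cell positions and blank them out.
    positions = rng.choice(values.size, size=n_missing, replace=False)
    values.flat[positions] = np.nan

    out[numeric_cols] = values
    return out


# Hypothetical toy table: 4 patients (rows) and 3 lipids (columns).
complete = pd.DataFrame(
    {
        "Label": ["T", "T", "T", "T"],
        "Lipid A": [1.2, 0.9, 0.7, 1.1],
        "Lipid B": [10.5, 11.1, 8.3, 9.4],
        "Lipid C": [250.0, 240.5, 198.2, 221.7],
    }
)

with_gaps = introduce_missing(complete, fraction=0.25, seed=1)
n_gaps = int(with_gaps.isna().sum().sum())
print(with_gaps)
print("Missing cells:", n_gaps)
```

Fixing the random seed makes the pattern of missing values reproducible, and non-numeric columns such as the group label are left untouched.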