Detecting missing values
Required packages
The required packages for this section are pandas, matplotlib and seaborn. These can be installed with the following command in the command window (Windows) / terminal (Mac).
pip install pandas matplotlib seaborn
Loading the data
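Place the downloaded Lipidomics_missing_values_EXAMPLE.xlsx file in the same folder as your JupyterLab notebook. Reading .xlsx files with pandas also requires the openpyxl package; if the import below fails, install it with pip install openpyxl. Then run the following code in Jupyter: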
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
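# Read the Excel file; the file name must match the file you downloaded
df = pd.read_excel("Lipidomics_missing_values_EXAMPLE.xlsx", decimal=",")
# Use the sample names as the row index
df.set_index("Sample Name", inplace=True)
We can generate a heatmap visualisation of the missing values across the table (the lightest cells indicate a missing value):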
plt.figure(figsize=(24, 30)) # Modify the width and height as needed
sns.heatmap(df.isnull(), cbar=False)
plt.savefig("missing_values_heatmap.png", dpi=200, bbox_inches='tight')
plt.show()
We can visualise the percentage of missing values in each sample:
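The original figure for this step is not reproduced here, so the following is only a minimal sketch of one way to compute and plot the per-sample percentages. It assumes that each row of the table is one sample, as in the df loaded above. The function names, the toy table and the output file name are hypothetical and only serve as an illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt


def percent_missing_per_sample(table: pd.DataFrame) -> pd.Series:
    """Return the percentage of missing values in each row (sample)."""
    # isnull() gives True/False; the row-wise mean of booleans is the
    # fraction of missing values, which is scaled to a percentage.
    return table.isnull().mean(axis=1) * 100


def plot_percent_missing(pct: pd.Series, xlabel: str, out_file: str) -> None:
    """Draw a bar chart of the percentages, highest first, and save it."""
    fig, ax = plt.subplots(figsize=(12, 4))  # Modify the size as needed
    pct.sort_values(ascending=False).plot.bar(ax=ax)
    ax.set_xlabel(xlabel)
    ax.set_ylabel("% missing values")
    fig.savefig(out_file, dpi=200, bbox_inches="tight")
    plt.show()


# Toy table (made-up values) standing in for the real data
toy = pd.DataFrame(
    {"Lipid A": [1.0, None, 3.0], "Lipid B": [None, None, 6.0]},
    index=pd.Index(["S1", "S2", "S3"], name="Sample Name"),
)
pct_samples = percent_missing_per_sample(toy)
print(pct_samples)
```

With the real table, the same functions could be called as plot_percent_missing(percent_missing_per_sample(df), "Sample", "missing_per_sample.png").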

And for each lipid species:
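As a hedged sketch of the per-species view, the percentage can be computed column by column, assuming each column of the table is one species. The function names, the toy table, the top_n parameter and the output file name are hypothetical choices for illustration, not part of the original tutorial.

```python
import pandas as pd
import matplotlib.pyplot as plt


def percent_missing_per_species(table: pd.DataFrame) -> pd.Series:
    """Return the percentage of missing values in each column (species)."""
    # The column-wise mean of the boolean mask is the fraction missing.
    return table.isnull().mean(axis=0) * 100


def plot_most_incomplete_species(pct: pd.Series, top_n: int, out_file: str) -> None:
    """Plot the top_n species with the highest percentage of missing values."""
    worst = pct.sort_values(ascending=False).head(top_n)
    fig, ax = plt.subplots(figsize=(12, 4))  # Modify the size as needed
    worst.plot.bar(ax=ax)
    ax.set_xlabel("Species")
    ax.set_ylabel("% missing values")
    fig.savefig(out_file, dpi=200, bbox_inches="tight")
    plt.show()


# Toy table (made-up values) standing in for the real data
toy_species = pd.DataFrame(
    {
        "Lipid A": [1.0, None, 3.0, 4.0],
        "Lipid B": [None, None, 6.0, 8.0],
        "Lipid C": [1.0, 2.0, 3.0, 4.0],
    },
    index=pd.Index(["S1", "S2", "S3", "S4"], name="Sample Name"),
)
pct_species = percent_missing_per_species(toy_species)
print(pct_species.sort_values(ascending=False))
```

With the real table, plot_most_incomplete_species(percent_missing_per_species(df), 30, "missing_per_species.png") would show the 30 least complete species; limiting the plot to the top few keeps the axis labels readable when the table has many columns.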
