This function generates exploratory data analysis (EDA) plots from the provided data: density plots, boxplots, PCA plots, MDS plots, variance explained plots, violin plots, mean correlation with time, lag-1 autocorrelation, lag-1 differences, and coefficient of variation. It returns all EDA plots as a list and, by default, also writes an HTML report containing the plots to the specified report directory.

Usage

explore_data(splineomics, report_dir = here::here(), report = TRUE)

Arguments

splineomics

A named list containing all required inputs for the splineomics workflow. Must contain the following elements:

data

[numeric matrix] The data matrix with the measured values. Columns are samples (one per timepoint and replicate combination) and rows are features (e.g. genes or proteins).

meta

[dataframe] A dataframe containing the metadata for data. It must include a 'Time' column and the column specified by condition. Each column of meta holds one type of metadata, and each row describes one sample, i.e. one column of data.

annotation

[dataframe] A dataframe that maps the rows of data to annotation info, such as the gene name or database identifiers.

report_info

[named list] A named list describing the experiment. Must include the following fields:
- "omics_data_type"
- "data_description"
- "data_collection_date"
- "analyst_name"
- "contact_info"
- "project_name"

May also include the following optional fields:
- "method_description"
- "results_summary"
- "conclusions"

condition

[character] Character vector of length 1 specifying the column name in meta used to define groups for analysis.

meta_batch_column

[character] Character vector of length 1 specifying the column name in meta that contains the info for the batch effect.

meta_batch2_column

[character] Character vector of length 1 specifying the column name in meta that contains the info for the second batch effect.
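
As a rough illustration of the elements described above, the following sketch assembles a toy input list in base R. All values, the meta column names (Sample, Phase, Batch, Batch2), and the annotation column names are hypothetical placeholders, and the sketch assumes a plain named list is accepted as described; the package may provide its own constructor. The call to explore_data() is left commented out so the block runs without the package installed.

```r
# Toy inputs that follow the element descriptions above (illustrative values only).
set.seed(1)

# meta: one row per sample, with the required 'Time' column plus
# hypothetical condition and batch columns.
meta <- data.frame(
  Sample = paste0("S", 1:12),
  Time   = rep(c(0, 2, 4), each = 4),
  Phase  = rep(c("Exponential", "Stationary"), times = 6),
  Batch  = rep(c("A", "B"), each = 2, times = 3),
  Batch2 = rep(c("X", "Y", "Y", "X"), times = 3),
  stringsAsFactors = FALSE
)

# data: numeric matrix with features in rows and samples in columns.
data <- matrix(
  rnorm(5 * nrow(meta)),
  nrow = 5,
  dimnames = list(paste0("feature_", 1:5), meta$Sample)
)

# annotation: one row per feature (column names are hypothetical).
annotation <- data.frame(
  feature_id = rownames(data),
  gene_name  = paste0("GENE", 1:5),
  stringsAsFactors = FALSE
)

# report_info: the six mandatory fields listed above, with placeholder text.
report_info <- list(
  omics_data_type      = "proteomics",
  data_description     = "Toy time-course data set",
  data_collection_date = "2024",
  analyst_name         = "Jane Doe",
  contact_info         = "jane.doe@example.org",
  project_name         = "Demo project"
)

splineomics <- list(
  data               = data,
  meta               = meta,
  annotation         = annotation,
  report_info        = report_info,
  condition          = "Phase",
  meta_batch_column  = "Batch",
  meta_batch2_column = "Batch2"
)

# Consistency checks implied by the descriptions above.
stopifnot(
  is.matrix(splineomics$data), is.numeric(splineomics$data),
  ncol(splineomics$data) == nrow(splineomics$meta),
  nrow(splineomics$annotation) == nrow(splineomics$data),
  "Time" %in% colnames(splineomics$meta),
  splineomics$condition %in% colnames(splineomics$meta)
)

# plots <- explore_data(splineomics)  # requires the package to be installed
```

Running the checks before calling the function catches the most common mismatch, a meta table whose rows do not line up with the columns of data.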

report_dir

A non-empty string specifying the report directory, where the output HTML reports are written. Defaults to the project root as returned by here::here(), which is not necessarily the current working directory.

report

A logical value (TRUE or FALSE) specifying whether an HTML report should be generated. Default is TRUE.
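
The two arguments above can be combined as in the following sketch. The directory name is a hypothetical placeholder, and creating the directory up front is a defensive assumption; the documentation above does not say whether the function creates it when missing. The calls are commented out because they need the package and a prepared splineomics input.

```r
# Hypothetical output location; any writable directory would do.
report_dir <- file.path(tempdir(), "explore_data_reports")

# Create the directory if it does not exist yet (defensive assumption).
if (!dir.exists(report_dir)) {
  dir.create(report_dir, recursive = TRUE)
}

# Plots plus HTML report written to report_dir:
# plots <- explore_data(splineomics, report_dir = report_dir)

# Plots only, no HTML report:
# plots <- explore_data(splineomics, report = FALSE)
```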

Value

A list of ggplot objects representing various exploratory plots.
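
Because the return value is a list of ggplot objects, it can be processed with ordinary ggplot2 tooling. The sketch below saves each plot in a list to its own PDF file using ggplot2::ggsave(). The plots list here is a stand-in built from the mtcars data set, not real output of explore_data(), and the file names are hypothetical; the list is indexed by position because the element names of the real return value are not documented above.

```r
library(ggplot2)

# Stand-in for the return value: a list of ggplot objects.
plots <- list(
  ggplot(mtcars, aes(x = mpg)) + geom_density(),
  ggplot(mtcars, aes(x = factor(cyl), y = mpg)) + geom_boxplot()
)

# Hypothetical output directory for the individual plot files.
out_dir <- file.path(tempdir(), "eda_plots")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Save every plot to a numbered PDF and collect the file paths.
saved_files <- vapply(seq_along(plots), function(i) {
  path <- file.path(out_dir, sprintf("eda_plot_%02d.pdf", i))
  ggsave(path, plot = plots[[i]], width = 6, height = 4)
  path
}, character(1))
```

Saving plots individually like this is mainly useful when report = FALSE, or when single figures are needed outside the HTML report.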