cluster_hits.R contains the exported package function cluster_hits and all the functions that implement it. cluster_hits clusters the hits of a time-series omics dataset (the features that changed significantly over the time course) by hierarchical clustering of their spline shapes.
Cluster Hits from Top Tables
cluster_hits.Rd
Performs clustering on hits from top tables generated by differential expression analysis. This function filters hits based on adjusted p-value thresholds, extracts spline coefficients for significant features, normalizes these coefficients, and applies hierarchical clustering. The results, including clustering assignments and normalized spline curves, are saved in a specified directory and compiled into an HTML report.
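The steps described above can be sketched in base R. This is a minimal illustration and not the package's implementation: the toy top table, the curve values, the min-max scaling, and the choice of complete linkage with two clusters are all assumptions made for the example.

```r
# Toy inputs (all values are made up for illustration).
time_points <- seq(0, 1, length.out = 5)
top_table <- data.frame(
  feature   = paste0("f", 1:6),
  adj.P.Val = c(0.001, 0.20, 0.03, 0.04, 0.80, 0.01)
)
# One row per feature: spline curve evaluated at the time points.
curves <- rbind(
  f1 = 1 + 2 * time_points,    # rising
  f2 = rep(3, 5),              # flat
  f3 = 10 + 20 * time_points,  # rising, on a different scale
  f4 = 5 - 4 * time_points,    # falling
  f5 = rep(1, 5),              # flat
  f6 = 8 - 2 * time_points     # falling
)

# 1. Keep only the hits: features below the adjusted p-value threshold.
adj_pthreshold <- 0.05
hits <- top_table$feature[top_table$adj.P.Val < adj_pthreshold]
hit_curves <- curves[hits, , drop = FALSE]

# 2. Normalize each curve to [0, 1] so that clustering compares shapes,
#    not magnitudes (min-max scaling is an assumption of this sketch).
normalize_curve <- function(x) (x - min(x)) / (max(x) - min(x))
norm_curves <- t(apply(hit_curves, 1, normalize_curve))

# 3. Hierarchical clustering on the normalized curves, cut into k clusters.
tree <- hclust(dist(norm_curves), method = "complete")
cluster_id <- cutree(tree, k = 2)
print(cluster_id)
```

In this toy data the rising curves (f1, f3) and the falling curves (f4, f6) end up in separate clusters, even though f1 and f3 differ in scale, because the normalization removes the magnitude before the distances are computed.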
Usage
cluster_hits(
  splineomics,
  clusters,
  adj_pthresholds = c(0.05),
  adj_pthresh_avrg_diff_conditions = 0,
  adj_pthresh_interaction_condition_time = 0,
  genes = NULL,
  plot_info = list(y_axis_label = "Value", time_unit = "min", treatment_labels = NA,
    treatment_timepoints = NA),
  plot_options = list(cluster_heatmap_columns = FALSE, meta_replicate_column = NULL),
  report_dir = here::here(),
  report = TRUE
)
Arguments
- splineomics
An S3 object of class `SplineOmics` that contains all the necessary data and parameters for the analysis, including:
  - data: The original expression dataset used for differential expression analysis.
  - meta: A data frame containing metadata corresponding to `data`; it must include a 'Time' column and any columns specified by `conditions`.
  - design: A character of length 1 representing the limma design formula.
  - condition: A character of length 1 specifying the column name in `meta` used to define groups for analysis.
  - spline_params: A list of spline parameters for the analysis.
  - meta_batch_column: A character string specifying the column name in the metadata used for batch effect removal.
  - meta_batch2_column: A character string specifying the second column name in the metadata used for batch effect removal.
  - limma_splines_result: A list of data frames, each representing a top table from differential expression analysis, containing at least 'adj.P.Val' and expression data columns.
- clusters
Character or integer vector specifying the number of clusters.
- adj_pthresholds
Numeric vector of adjusted p-value thresholds for filtering hits in each top table.
- adj_pthresh_avrg_diff_conditions
p-value threshold for the limma results on the average difference between the conditions. Defaults to 0 (turned off).
- adj_pthresh_interaction_condition_time
p-value threshold for the limma results on the interaction of condition and time. Defaults to 0 (turned off).
- genes
A character vector containing the gene names of the features to be analyzed.
- plot_info
List containing the elements y_axis_label (string), time_unit (string), treatment_labels (character vector), and treatment_timepoints (integer vector). Any of them can also be NA. The list supplies annotations for the spline plots: time_unit labels the x-axis, and treatment_labels and treatment_timepoints are used to draw vertical dashed lines marking the positions of the treatments (such as feeding or a temperature shift).
- plot_options
List of fields that customize plotting behavior: cluster_heatmap_columns (Boolean) and, as shown in the usage above, meta_replicate_column (NULL by default).
- report_dir
Character string specifying the directory path where the HTML report and any other output files should be saved.
- report
Boolean value (TRUE or FALSE) specifying whether a report should be generated.
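As an illustration of how the list arguments documented above might be assembled, here is a hedged sketch. The label and time point values are invented, `splineomics` is assumed to be an already prepared `SplineOmics` object that is not created here, the package is assumed to be installed under the name SplineOmics, and `clusters = 3L` and `tempdir()` are arbitrary choices for the example.

```r
# Annotations for the spline plots (values are hypothetical).
plot_info <- list(
  y_axis_label         = "log2 intensity",
  time_unit            = "min",
  treatment_labels     = c("feeding"),
  treatment_timepoints = c(0)  # a dashed vertical line at time 0
)

# Plotting options, using the defaults shown in the usage section.
plot_options <- list(
  cluster_heatmap_columns = FALSE,
  meta_replicate_column   = NULL
)

# The call only runs if a `splineomics` object exists in the session and
# the package is available; otherwise this block just builds the lists.
if (exists("splineomics") && requireNamespace("SplineOmics", quietly = TRUE)) {
  clustering_results <- SplineOmics::cluster_hits(
    splineomics     = splineomics,
    clusters        = 3L,        # arbitrary number of clusters
    adj_pthresholds = c(0.05),
    plot_info       = plot_info,
    plot_options    = plot_options,
    report_dir      = tempdir(), # write the report to a temporary directory
    report          = TRUE
  )
}
```

Keeping treatment_labels and treatment_timepoints the same length makes it unambiguous which label belongs to which vertical line.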