
Performs clustering on hits from top tables generated by differential expression analysis. This function filters hits based on adjusted p-value thresholds, extracts spline coefficients for significant features, normalizes these coefficients, and applies hierarchical clustering. The results, including clustering assignments and normalized spline curves, are saved in a specified directory and compiled into an HTML report.
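The steps named above (filter, extract coefficients, normalize, cluster) can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy top table, the coefficient column names `X1` to `X3`, the row-wise z-score normalization, and the `complete` linkage are all choices made for the example.

```r
# Illustrative sketch only; not the SplineOmics implementation.
set.seed(1)

# Toy top table: 6 features with adjusted p-values and three
# hypothetical spline coefficient columns (X1..X3).
top_table <- data.frame(
  feature   = paste0("f", 1:6),
  adj.P.Val = c(0.001, 0.20, 0.01, 0.03, 0.60, 0.04),
  X1 = rnorm(6), X2 = rnorm(6), X3 = rnorm(6)
)

# 1. Keep hits below the adjusted p-value threshold.
hits <- top_table[top_table$adj.P.Val < 0.05, ]

# 2. Extract the coefficient columns as a matrix, one row per feature.
coefs <- as.matrix(hits[, c("X1", "X2", "X3")])
rownames(coefs) <- hits$feature

# 3. Normalize each feature's coefficients
#    (assumed here: row-wise z-score; the package may normalize differently).
norm_coefs <- t(scale(t(coefs)))

# 4. Hierarchical clustering, then cut the tree into k clusters.
hc <- hclust(dist(norm_coefs), method = "complete")
hits$cluster <- cutree(hc, k = 2)

print(hits[, c("feature", "adj.P.Val", "cluster")])
```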

Usage

cluster_hits(
  splineomics,
  clusters,
  adj_pthresholds = c(0.05),
  adj_pthresh_avrg_diff_conditions = 0,
  adj_pthresh_interaction_condition_time = 0,
  genes = NULL,
  plot_info = list(y_axis_label = "Value", time_unit = "min", treatment_labels = NA,
    treatment_timepoints = NA),
  plot_options = list(cluster_heatmap_columns = FALSE, meta_replicate_column = NULL),
  report_dir = here::here(),
  report = TRUE
)

Arguments

splineomics

An S3 object of class `SplineOmics` that contains all the necessary data and parameters for the analysis, including:

  • data: The original expression dataset used for differential expression analysis.

  • meta: A data frame containing metadata corresponding to data. It must include a 'Time' column and any columns specified by condition.

  • design: A character vector of length 1 representing the limma design formula.

  • condition: A character vector of length 1 specifying the column name in meta used to define groups for analysis.

  • spline_params: A list of spline parameters for the analysis.

  • meta_batch_column: A character string specifying the column name in the metadata used for batch effect removal.

  • meta_batch2_column: A character string specifying the second column name in the metadata used for batch effect removal.

  • limma_splines_result: A list of data frames, each representing a top table from differential expression analysis, containing at least 'adj.P.Val' and expression data columns.

clusters

Character or integer vector specifying the number of clusters.

adj_pthresholds

Numeric vector of adjusted p-value thresholds for filtering hits in each top table.
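As a rough sketch of per-table filtering, the snippet below pairs one threshold with each top table. It assumes thresholds match top tables by position and that a strict `<` comparison is used; neither detail is confirmed by this page, and the table names and values are invented.

```r
# Hypothetical top tables, one per level of the condition column.
top_tables <- list(
  condition_A = data.frame(feature = c("f1", "f2", "f3"),
                           adj.P.Val = c(0.01, 0.04, 0.30)),
  condition_B = data.frame(feature = c("f1", "f2", "f3"),
                           adj.P.Val = c(0.20, 0.02, 0.06))
)

# One threshold per top table (assumed to be matched by position).
adj_pthresholds <- c(0.05, 0.1)

# Filter each table with its own threshold.
hits <- Map(
  function(tt, thr) tt[tt$adj.P.Val < thr, , drop = FALSE],
  top_tables,
  adj_pthresholds
)

# Number of hits retained per table.
print(sapply(hits, nrow))
```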

adj_pthresh_avrg_diff_conditions

Adjusted p-value threshold for the limma result on the average difference between conditions. Defaults to 0 (turned off).

adj_pthresh_interaction_condition_time

Adjusted p-value threshold for the limma result on the interaction of condition and time. Defaults to 0 (turned off).

genes

A character vector containing the gene names of the features to be analyzed.

plot_info

List containing the elements y_axis_label (string), time_unit (string), treatment_labels (character vector), and treatment_timepoints (integer vector). Any of these can also be NA. The list supplies annotations for the spline plots: time_unit labels the x-axis, and treatment_labels together with treatment_timepoints create vertical dashed lines that mark the positions of treatments (such as feeding or a temperature shift).
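A minimal sketch of such a list, using the documented element names. The label and timepoint values are invented for illustration, and the length check assumes that labels and timepoints are paired by position.

```r
# Example plot_info list; all values are illustrative.
plot_info <- list(
  y_axis_label = "log2 intensity",
  time_unit = "min",
  treatment_labels = c("Feeding", "Temperature shift"),
  treatment_timepoints = c(0L, 60L)
)

# Assumed sanity check: one timepoint per treatment label.
stopifnot(
  length(plot_info$treatment_labels) == length(plot_info$treatment_timepoints)
)
```

The resulting list would then be passed as the plot_info argument of cluster_hits().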

plot_options

List of fields that customize plotting behavior, for example cluster_heatmap_columns (logical). The default shown under Usage also sets meta_replicate_column to NULL.

report_dir

Character string specifying the directory path where the HTML report and any other output files should be saved.

report

Logical value (TRUE or FALSE) specifying whether a report should be generated.

Value

A list in which each element corresponds to a group factor and contains the clustering results: the `clustered_hits` data frame, the hierarchical clustering object `hc`, the `curve_values` data frame with normalized spline curves, and the `top_table` with cluster assignments.
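To illustrate how a result of this shape might be inspected, the sketch below builds a small mock list with two of the documented element names (`clustered_hits` and `hc`) and loops over it. The group name `condition_A`, the data, and the `cluster` column name are hypothetical; real output from cluster_hits() may be organized differently in detail.

```r
# Mock result; contents are invented for illustration only.
x <- matrix(c(0, 0.1, 1, 1.1,
              0, 0.2, 1, 0.9),
            ncol = 2,
            dimnames = list(paste0("f", 1:4), NULL))
hc <- hclust(dist(x))

mock_result <- list(
  condition_A = list(
    clustered_hits = data.frame(
      feature = rownames(x),
      cluster = cutree(hc, k = 2)  # hypothetical column name
    ),
    hc = hc
  )
)

for (group in names(mock_result)) {
  res <- mock_result[[group]]
  # Cluster sizes for this group.
  print(table(res$clustered_hits$cluster))
  # The stored hclust object can be re-cut at a different k.
  print(cutree(res$hc, k = 3))
}
```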

See also