create_splineomics() • SplineOmics

Creates a SplineOmics object containing variables that are commonly used across multiple functions in the package. This object is then passed as an argument to the other functions of this package.

Usage

create_splineomics(
  data,
  meta,
  condition,
  rna_seq_data = NULL,
  annotation = NULL,
  report_info = NULL,
  meta_batch_column = NULL,
  meta_batch2_column = NULL,
  feature_name_columns = NULL,
  design = NULL,
  use_array_weights = TRUE,
  dream_params = NULL,
  mode = NULL,
  spline_params = NULL,
  padjust_method = "BH",
  bp_cfg = NULL
)

Arguments

data

The omics data matrix. Provide this argument even when the rna_seq_data argument is used; in that case, pass the data matrix here (for example, the $E component of the voom object). Set your feature names as row names; otherwise, the features are identified only by their row numbers.
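As a minimal sketch of the expected shape, the following builds a toy matrix with feature names as row names. The feature and sample names, and the dimensions, are hypothetical and chosen only for illustration.

```r
# Toy data: 4 features (rows) x 6 samples (columns).
# All names here are hypothetical placeholders.
set.seed(1)
data <- matrix(
  rnorm(24),
  nrow = 4,
  dimnames = list(
    paste0("feature_", 1:4),  # row names serve as the feature names
    paste0("sample_", 1:6)    # column names identify the samples
  )
)

# Inspect the feature names that downstream functions would see.
rownames(data)
```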

meta

Metadata associated with the omics data.

condition

The name of the column in meta that contains the levels of the experimental condition.

rna_seq_data

An object containing the preprocessed RNA-seq data, such as the output from `limma::voom` or a similar preprocessing pipeline. This argument is not validated by this package; instead, validation is left to the `limma::lmFit` function.

annotation

A dataframe with the feature descriptions of data (optional).

report_info

A named list describing the experiment. Must include the following fields:
- "omics_data_type"
- "data_description"
- "data_collection_date"
- "analyst_name"
- "contact_info"
- "project_name"

May also include the following optional fields:
- "method_description"
- "results_summary"
- "conclusions"
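A report_info list with the mandatory fields might look like the sketch below. All field values are made-up placeholders, and the final check is only an illustration, not the package's own input validation.

```r
# Hypothetical example values; only the field names come from the documentation.
report_info <- list(
  omics_data_type      = "proteomics",
  data_description     = "Time-course experiment, toy example",
  data_collection_date = "2024-01-15",
  analyst_name         = "Jane Doe",
  contact_info         = "jane.doe@example.org",
  project_name         = "Example project",
  method_description   = "Optional free-text description of the methods"
)

# Illustrative check that no mandatory field is missing.
required_fields <- c(
  "omics_data_type", "data_description", "data_collection_date",
  "analyst_name", "contact_info", "project_name"
)
missing_fields <- setdiff(required_fields, names(report_info))
stopifnot(length(missing_fields) == 0)
```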

meta_batch_column

Column for meta batch information (optional).

meta_batch2_column

Column for secondary meta batch information (optional).

feature_name_columns

Character vector containing the names of the annotation columns that describe the features. This argument is used to state in the HTML report how the feature names displayed above each individual spline plot were created. Use the same vector that was used to create the row names of the data matrix.

design

A design matrix or similar object (optional).

use_array_weights

Boolean flag indicating whether the robust fit strategy for dealing with heteroscedasticity should be used. If set to NULL, this is handled implicitly based on the result of the Levene test: if the test is significant for at least 10% of the features, the robust strategy is used. For RNA-seq data, the robust strategy uses the function voomWithQualityWeights instead of the normal voom function. For other, non-count-based data, it uses the function limma::arrayWeights, combined with setting the robust argument of the limma::eBayes function to TRUE. In summary, these functions downweight samples with higher variance. This can be necessary because linear models assume homoscedasticity, meaning that the variance is approximately the same across all data points to which the linear model is fitted. If this assumption is violated, the resulting p-values cannot be trusted.
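The robust strategy for non-count data described above can be sketched directly with limma. This is an illustration on simulated data under the assumption that limma is installed, not the package's internal code; the data, the design, and the inflated-variance sample are all made up.

```r
# Simulated non-count data: 200 features x 8 samples in two groups.
set.seed(42)
group <- factor(rep(c("A", "B"), each = 4))
design <- model.matrix(~ group)
expr <- matrix(rnorm(200 * 8), nrow = 200)
expr[, 8] <- expr[, 8] * 4  # give one sample a much higher variance

if (requireNamespace("limma", quietly = TRUE)) {
  # Estimate one quality weight per sample.
  array_weights <- limma::arrayWeights(expr, design)

  # Fit the linear models with the sample weights, then moderate robustly.
  fit <- limma::lmFit(expr, design, weights = array_weights)
  fit <- limma::eBayes(fit, robust = TRUE)

  print(round(array_weights, 2))
}
```

Samples with unusually high variance are expected to receive smaller weights, so they contribute less to the fitted model.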

dream_params

A named list or NULL. When not NULL, it can contain the following named elements:
- `dof`: An integer greater than 1, specifying the degrees of freedom for the dream topTable.
- `KenwardRoger`: A boolean indicating whether to use the Kenward-Roger approximation for mixed models.

Note that random effects are now directly specified in the design formula and not in `dream_params`.
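For illustration, a dream_params list using both documented elements could be written as follows. The concrete values are arbitrary examples.

```r
# Example values only; both elements are optional according to the description.
dream_params <- list(
  dof = 3L,             # integer greater than 1
  KenwardRoger = FALSE  # TRUE would request the Kenward-Roger approximation
)

str(dream_params)
```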

mode

For the design formula, you must specify either 'isolated' or 'integrated' for the mode. Isolated means limma determines the results for each level using only the data from that level. Integrated means limma determines the results for all levels using the full dataset (from all levels).

spline_params

Parameters for the spline functions (optional). When provided, it must contain the following named elements: spline_type, which must be either the string "n" for natural cubic splines or "b" for B-splines; degree, an integer that is required only for B-splines; and dof, an integer specifying the degrees of freedom, which is required for both natural cubic splines and B-splines.
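The two variants of spline_params can be sketched as below. To show what the settings mean, the example also builds the corresponding bases with the splines package that ships with R; mapping the list elements onto splines::ns and splines::bs in this way is an assumption for illustration, and the time points are made up.

```r
library(splines)

# Natural cubic splines: only spline_type and dof are needed.
spline_params_n <- list(spline_type = "n", dof = 2L)

# B-splines: degree is required in addition.
spline_params_b <- list(spline_type = "b", degree = 3L, dof = 4L)

# Hypothetical time points of a time-course experiment.
time <- c(0, 1, 2, 4, 8, 12)

# Each basis has one column per degree of freedom.
basis_n <- ns(time, df = spline_params_n$dof)
basis_b <- bs(time, df = spline_params_b$dof, degree = spline_params_b$degree)

dim(basis_n)
dim(basis_b)
```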

padjust_method

Method for p-value adjustment, one of "none", "BH", "BY", "holm", "bonferroni", "hochberg", or "hommel". Defaults to "BH" (Benjamini-Hochberg).
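The listed method names are the ones accepted by stats::p.adjust in base R. The sketch below shows the effect of the default "BH" adjustment next to "bonferroni" on a few made-up p-values; whether the package calls p.adjust internally is not stated here.

```r
# Made-up raw p-values for five features.
p_values <- c(0.001, 0.01, 0.02, 0.04, 0.2)

# Benjamini-Hochberg controls the false discovery rate.
p_bh <- p.adjust(p_values, method = "BH")

# Bonferroni is more conservative: each p-value is multiplied by the
# number of tests and capped at 1.
p_bonferroni <- p.adjust(p_values, method = "bonferroni")

data.frame(raw = p_values, BH = p_bh, bonferroni = p_bonferroni)
```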

bp_cfg

A named numeric vector specifying the parallelization configuration, with expected names `"n_cores"` and `"blas_threads"`.

This controls how many R worker processes (`n_cores`) and how many BLAS/OpenBLAS threads per process (`blas_threads`) should be used during parallel computation.

If `bp_cfg` is `NULL`, missing, or any of its required fields is `NA`, both `n_cores` and `blas_threads` default to `1`. This effectively disables parallelization and avoids oversubscription of CPU threads.
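The fallback behavior described above can be sketched with a small helper. The function resolve_bp_cfg is hypothetical and not part of the package; it also treats a vector that lacks one of the two names as incomplete, which is an assumption.

```r
# A valid configuration: 4 worker processes, 1 BLAS thread per process.
bp_cfg <- c(n_cores = 4, blas_threads = 1)

# Hypothetical helper mimicking the documented defaults.
resolve_bp_cfg <- function(bp_cfg = NULL) {
  required <- c("n_cores", "blas_threads")
  incomplete <- is.null(bp_cfg) ||
    !all(required %in% names(bp_cfg)) ||
    anyNA(bp_cfg[required])
  if (incomplete) {
    # Fall back to a single process with a single BLAS thread.
    return(c(n_cores = 1, blas_threads = 1))
  }
  bp_cfg[required]
}

resolve_bp_cfg(bp_cfg)
resolve_bp_cfg(NULL)
resolve_bp_cfg(c(n_cores = 4, blas_threads = NA))
```

Keeping `blas_threads` at 1 while raising `n_cores` is a common way to avoid running more threads than there are CPU cores.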

Value

A SplineOmics object.
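Putting the arguments together, a call might look like the sketch below. The toy data, the metadata column names (Sample, Time, Phase), and all values are hypothetical; passing the condition as the name of a metadata column is an assumption. The call itself only runs if the SplineOmics package is installed.

```r
# Toy inputs: 4 features x 8 samples, two condition levels, four time points.
set.seed(1)
data <- matrix(
  rnorm(32),
  nrow = 4,
  dimnames = list(paste0("feature_", 1:4), paste0("sample_", 1:8))
)
meta <- data.frame(
  Sample = colnames(data),
  Time = rep(c(0, 1, 2, 4), times = 2),
  Phase = rep(c("Exponential", "Stationary"), each = 4)
)
report_info <- list(
  omics_data_type = "proteomics",
  data_description = "Toy time-course data",
  data_collection_date = "2024-01-15",
  analyst_name = "Jane Doe",
  contact_info = "jane.doe@example.org",
  project_name = "Example project"
)

if (requireNamespace("SplineOmics", quietly = TRUE)) {
  splineomics <- SplineOmics::create_splineomics(
    data = data,
    meta = meta,
    condition = "Phase",  # assumed to name a column of meta
    report_info = report_info,
    spline_params = list(spline_type = "n", dof = 2L),
    padjust_method = "BH"
  )
  class(splineomics)
}
```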