The `preprocess_rna_seq_data()` function performs the essential preprocessing steps for raw RNA-seq counts: it creates a `DGEList` object, computes normalization factors with `edgeR::calcNormFactors` (TMM, Trimmed Mean of M-values, by default), and applies the `voom` transformation from the `limma` package to obtain log2-transformed counts per million (logCPM) with associated precision weights. If you require a different normalization method, you can supply your own function through the `normalize_func` argument.
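The three steps described above can be sketched with direct calls to `edgeR` and `limma`. This is a minimal illustration, not the function's actual implementation: the simulated count matrix, the `meta` data frame, and the simple `~ 1 + Time` design are assumptions made for the example (the package derives its design from the `SplineOmics` object).

```r
library(edgeR)
library(limma)

set.seed(1)

# Hypothetical raw counts: 200 genes (rows) x 6 samples (columns)
raw_counts <- matrix(
  rnbinom(200 * 6, mu = 50, size = 5),
  nrow = 200,
  dimnames = list(paste0("gene", 1:200), paste0("sample", 1:6))
)

# Hypothetical metadata with the required 'Time' column
meta <- data.frame(Time = rep(c(0, 1, 2), times = 2))

# Stand-in design matrix (assumption: a plain linear time effect)
design_matrix <- model.matrix(~ 1 + Time, data = meta)

# Step 1: wrap the counts in a DGEList
y <- edgeR::DGEList(counts = raw_counts)

# Step 2: compute normalization factors (method defaults to "TMM")
y <- edgeR::calcNormFactors(y)

# Step 3: voom transformation -> logCPM values plus precision weights
voom_obj <- limma::voom(y, design = design_matrix)

dim(voom_obj$E)        # logCPM matrix, genes x samples
dim(voom_obj$weights)  # observation-level weights, same shape
```

The `E` and `weights` components have the same dimensions as the input count matrix, so downstream `limma` functions can weight each observation individually.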
Arguments
- splineomics
An S3 object of class `SplineOmics` that must contain the following elements:
  - `data`: A matrix of the omics dataset, with feature names optionally as row headers (genes as rows, samples as columns).
  - `meta`: A dataframe containing metadata corresponding to the `data`. The dataframe must include a 'Time' column and a column specified by the `condition`.
  - `design`: A character string representing the design formula for the limma analysis (e.g., `'~ 1 + Phase*X + Reactor'`).
  - `spline_params`: A list of spline parameters used in the analysis. This can include:
    - `spline_type`: A character string specifying the type of spline. Must be either `'n'` for natural cubic splines or `'b'` for B-splines.
    - `dof`: An integer specifying the degrees of freedom. Required for both natural cubic splines and B-splines.
    - `degree`: An integer specifying the degree of the spline (for B-splines only).
    - `knots`: Positions of the internal knots (for B-splines).
    - `bknots`: Boundary knots (for B-splines).
  - `dream_params`: A named list or `NULL`. When not `NULL`, it can contain:
    - `dof`: An integer greater than 1, specifying the degrees of freedom for the dream topTable.
    - `KenwardRoger`: A boolean indicating whether to use the Kenward-Roger method.
- normalize_func
An optional normalization function. If provided, it is used to normalize the `DGEList` object; otherwise, TMM normalization (via `edgeR::calcNormFactors`) is applied by default. The function must accept the object `y` created by `y <- edgeR::DGEList(counts = raw_counts)` and return `y` with the normalization applied.
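As an illustration of the `normalize_func` contract, the sketch below defines a custom normalizer that swaps TMM for upper-quartile normalization. The name `uq_normalize` and the simulated counts are hypothetical; the only requirement taken from the description above is that the function receives a `DGEList` and returns it with normalization applied.

```r
library(edgeR)

# Hypothetical custom normalizer: takes a DGEList, returns a DGEList
# whose norm.factors were computed with the upper-quartile method.
uq_normalize <- function(y) {
  edgeR::calcNormFactors(y, method = "upperquartile")
}

# Quick check on simulated counts (assumed data, for illustration only)
set.seed(2)
raw_counts <- matrix(rnbinom(100 * 4, mu = 50, size = 5), nrow = 100)
y <- edgeR::DGEList(counts = raw_counts)
y_norm <- uq_normalize(y)

y_norm$samples$norm.factors  # one factor per sample

# The function would then be passed to the preprocessing step, e.g.:
# preprocess_rna_seq_data(
#   splineomics = splineomics,
#   normalize_func = uq_normalize
# )
```

Returning the full `DGEList` rather than a bare count matrix keeps the library sizes and normalization factors together, which is what the subsequent `voom` step needs.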
Value
A `voom` object, which includes the log2-counts per million (logCPM) matrix and observation-specific weights.