Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Template for Changelog Entries

Each release section should follow the structure below:

[Version] - YYYY-MM-DD

Added

New features or functionality introduced in this release.

Changed

Updates or modifications to existing features.

Fixed

Bugs that have been resolved.

Deprecated

Features that are still functional but are slated for removal in the future.

Removed

Features or functionality that have been removed.

Security

Any security-related updates or patches.

Examples:
- Added: Introduced a new plotting function plotTimeSeries().
- Fixed: Resolved an issue causing crashes when input data was missing values.
- Changed: Modified the default parameters for normalizeData().

[0.2.1] - (in preparation)

[0.2.0] - 2025-05-08

Fixed

Bug that prevented the generation of the double spline plots when the condition column of meta was a factor instead of a string. The column is now converted to a string internally, so a factor condition column (which is perfectly valid) no longer causes problems.
Bug that occurred in the internal clean_gene_symbols() function when one of the gene names was NA. This no longer causes a problem; the respective spline plot now shows NA in place of the gene name.
The error message shown when more clusters were specified than there were hits in a condition was very cryptic. It is now easy to understand.
A bug in input control that threw an error indicating that the dataset contains rows consisting only of zeros, even though this was not true.
When running cluster_hits() with report = FALSE while there were many hits, the function still warned the user that report generation would take a long time. This warning is no longer shown, because no report is generated when report = FALSE and the warning would only confuse the user.

Changed

Replicates in the double spline plots (limma result category 2 and 3) are now shown with shapes instead of numbers beside the data points. This is possible because imputed values in those plots are no longer shown with shapes, as they are in the single spline plots (limma result category 1), but with colors (red = imputed values for condition 1, dodgerblue = imputed values for condition 2).
The images in the HTML reports are now considerably smaller, but zoomable, with the option to drag within the zoomed image. A single left-click with the mouse starts the zoom process. Then, you can zoom with the mouse wheel. If you reach the max zoom, you can move in the image by further zooming with the mouse wheel but placing the mouse in one of the corners. Further, one can zoom by holding left-click pressed and dragging (you can then also let go).
SplineOmics can now handle datasets with NA values. limma can do that, and since SplineOmics is based on limma, it can in principle do it too. The issue before was that checks were in place that prevented NA values from entering the pipeline, and that other steps in the code caused problems, raised warnings, or stopped the execution when facing NA values.
Changed the extract_data() function so that it handles missing values correctly and is now fully explicit, without any internal magic.
When there were fewer than two hits in all levels, the cluster_hits() function printed a message about this and threw an error. This was fine when running the function to generate a single result. However, when calling the function inside a loop to generate multiple results, this behavior was suboptimal, because the error had to be caught with tryCatch(). Now, if there are fewer than two hits in all levels, the function simply returns NULL, which is cleaner and does not break loops when one result cannot be generated.
Changed the extract_data() function so that it uses explicit rather than implicit logic. Before, it tried to find the numeric matrix field automatically. Now, it requires specifying the top and bottom row and the left and right column, which together mark the field where the matrix is. The rest of the logic (concerning the row name columns) is unchanged.
Renamed make_scatter_plot_html() to make_scatter_plots_html().
Renamed run_gsea() to run_ora(), because the function uses clusterProfiler, which performs ORA (overrepresentation analysis) and not GSEA (gene set enrichment analysis) in the stricter sense.
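
The NULL return value described above for cluster_hits() makes loops straightforward to write. The following is only a minimal sketch: run_clustering() is a hypothetical stand-in for a real cluster_hits() call (its actual arguments are not shown here), and the hit counts are made-up illustration values.

```r
# Hypothetical stand-in for a cluster_hits() call. It mimics only the
# documented behavior: NULL when there are fewer than two hits.
run_clustering <- function(n_hits) {
  if (n_hits < 2) {
    return(NULL)
  }
  list(n_hits = n_hits)
}

# Made-up hit counts for three analyses.
hit_counts <- c(set_a = 5, set_b = 1, set_c = 12)

# No tryCatch() needed: run everything, then drop the NULL results.
results <- lapply(hit_counts, run_clustering)
results <- Filter(Negate(is.null), results)

names(results)
```

Filtering after the loop keeps the names of the successful runs, so results that could not be generated are simply absent from the list.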

Added

find_pvc() function. This function performs a compound contrast for every timepoint triple in the data (adjacent timepoints) to find peaks, valleys, and cliffs as a form of local temporal patterns. It generates an HTML report containing all findings in files as well as the plots.
New vignette showcasing the find_pvc() function.
Implicit (default) and explicit heteroscedasticity handling of the linear model input. Controlled by the robust_fit Boolean flag indicating whether the robust modeling strategy should be used to account for heteroscedasticity. If omitted (i.e., robust_fit = NULL), the decision is made implicitly based on the result of the Breusch–Pagan test, which is applied independently to each feature. This test fits a linear model of the form expression ~ time and checks whether the residual variance systematically depends on the fitted values. If a significant violation of homoscedasticity is detected in a sufficient fraction of features (default: 10%), the robust strategy is applied. For RNA-seq data, this means using voomWithQualityWeights() instead of the standard voom() function. For other, non-count-based data, limma::arrayWeights() is used during model fitting, combined with robust = TRUE in limma::eBayes(). These strategies downweight samples with higher residual variance to prevent bias in the linear modeling step, as violations of the homoscedasticity assumption can lead to misleading p-values and unreliable inference.
New note in the HTML report of the cluster_hits() function that informs the user that the splines shown in the plots can appear to have the wrong intercept. This can occur when a batch effect and/or random effect is modeled with limma or the linear mixed models, but the plotting data is batch-corrected only with the dedicated limma batch correction function. For this reason, there can be a gap between the spline and the data points. The results are fine; only the plotting is off.
New user-available function called compare_results() that correlates the topTable results from two SplineOmics results (for example, integrated vs. isolated approach with the same data).
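
The implicit heteroscedasticity decision described above can be illustrated with a small base-R sketch. This is not the SplineOmics implementation: bp_pvalue() and decide_robust_fit() are hypothetical helper names, the test shown is Koenker's studentized variant of the Breusch-Pagan test, and the significance level alpha = 0.05 is an assumption (only the 10% feature fraction comes from the text above).

```r
# Hypothetical helper: studentized Breusch-Pagan p-value for one feature.
# Fits expression ~ time, then regresses the squared residuals on the
# fitted values; the statistic n * R^2 is compared to chi-square (df = 1).
bp_pvalue <- function(expression, time) {
  fit <- lm(expression ~ time)
  squared_res <- residuals(fit)^2
  aux <- lm(squared_res ~ fitted(fit))
  stat <- length(squared_res) * summary(aux)$r.squared
  pchisq(stat, df = 1, lower.tail = FALSE)
}

# Hypothetical helper: TRUE if the fraction of features (rows of `data`)
# with a significant test result reaches the threshold (default 10%).
decide_robust_fit <- function(data, time, alpha = 0.05,
                              fraction_threshold = 0.10) {
  pvals <- apply(data, 1, bp_pvalue, time = time)
  mean(pvals < alpha, na.rm = TRUE) >= fraction_threshold
}

# Toy data: one feature with constant spread, one whose spread grows.
time <- rep(1:3, each = 4)
homo <- 10 + 2 * time + rep(c(-2, -1, 1, 2), times = 3)
hetero <- 10 + 2 * time +
  rep(c(0.1, 1, 3), each = 4) * rep(c(-1, 1, -1, 1), times = 3)

decide_robust_fit(rbind(homo, hetero), time)
```

With a single predictor, regressing the squared residuals on the fitted values is equivalent to regressing them on time, which is why one degree of freedom is used.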

[0.1.2] - 2025-02-11

Added

Level name in title of overall spline plots.

Fixed

Small bug that made it impossible to have no treatment label for a condition.
Small bug that did not allow specifying two treatment labels.
mode == isolated for RNA-seq data. Before, it caused an error, because it split the meta into the different conditions but did not do so for the data. Now, it informs the user that it cannot do this and that the data must be split up by the user beforehand (RNA-seq data only).

Removed

open_template() function. This function opened a template for your own analysis. I removed it because otherwise, whenever I change the code, I would have to update both the get-started vignette (tutorial) and the template, which is twice the maintenance effort.

[0.1.1] - 2025-01-29

Changed

The design formula must now contain the string ‘Time’ rather than ‘X’ as before (X stood for the time). This change is intended to make the design formula more explicit and self-explanatory.
Random effects can now be specified directly in the design formula, rather than being passed as part of the dream_params.

Added

Linear mixed models for modeling random effects. The variancePartition package is used for this: voomWithDreamWeights() for RNA-seq data processing, and dream() as the replacement for limma. For example, if Reactor is your random effect, you can write the design formula like this: design <- "~ 1 + Condition*Time + Plate + (1|Reactor)" and SplineOmics will automatically run the variancePartition functions voomWithDreamWeights() and dream() instead of limma::voom() and limma::lmFit(). dream() has additional parameters, such as the method and the degrees of freedom (distinct from the degrees of freedom used for the splines in this package), and you can pass these with the dream_params argument. See the RNA-seq analysis vignette or the respective function references for more info.
Raw data plotting function make_scatter_plot_html() (see the function references).
Imputed values are marked with triangle symbols in cluster_hits() spline plots if raw data is passed.
Package version is written in each HTML report (the tag) and the session info is added as an embedded text file.
Standard error written on top of “double spline” plots (limma result category 2 and 3).
Used analysis script is embedded as a text file in the reports.
The mode (integrated or isolated) is written on top of the reports in a separate field.
Now there are 4 significance stars.
Treatment lines for the double spline plots (limma result category 2 and 3).
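
To make the design formula conventions from this release concrete, here is a small sketch in base R. The two helper functions are hypothetical and are not part of the SplineOmics API; they only illustrate what a formula string with the required ‘Time’ term and an lme4-style random effect term such as (1|Reactor) looks like.

```r
# Design formula from the entry above, written as a plain string.
design <- "~ 1 + Condition*Time + Plate + (1|Reactor)"

# Hypothetical helper: does the formula mention the required 'Time' term?
mentions_time <- function(design) {
  grepl("\\bTime\\b", design, perl = TRUE)
}

# Hypothetical helper: does the formula contain a random effect term of
# the form (lhs|group)?
has_random_effect <- function(design) {
  grepl("\\([^()|]+\\|[^()]+\\)", design, perl = TRUE)
}

mentions_time(design)
has_random_effect(design)
has_random_effect("~ 1 + Condition*Time + Plate")
```

The last call uses a formula with fixed effects only, which would correspond to the limma route rather than the variancePartition route described above.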

Fixed

A few smaller visual things for the HTML reports.
