Skip to contents

About this tutorial

This tutorial demonstrates the excursion-analysis capabilities of the SplineOmics package by walking through a complete, real-world phosphoproteomics example from start to finish.

In the context of time series data, an excursion refers to a transient deviation from a baseline — a concept borrowed from signal processing. These short-term, local shifts can reveal biologically meaningful events that are not captured by global trends.

Details of how excursions are detected are described in the section on the PVC test, a method specifically designed to identify such local patterns.

While the core strength of SplineOmics lies in modeling global temporal patterns using smoothing splines, many biologically important phenomena present as localized events. These may not be well captured by standard spline fits due to their smooth and global nature.

For example, in phosphoproteomics data, you might observe a sharp, temporary phosphorylation spike at a specific timepoint following a stimulus — a pattern that returns to baseline shortly afterward. Such a transient peak can indicate a key signaling activation event that is functionally critical, but would be smoothed out or diluted in traditional spline-based analyses.

To handle these scenarios, we developed the PVC test (described below), which is specifically designed to detect such sharp, local excursions in time-resolved omics data.

Note 1

The documentation of all the SplineOmics package functions can be viewed here

Note 2

This vignette focuses on the excursion-analysis. The general functionalities of SplineOmics are explained in the get-started vignette.

The PVC Test

PVC stands for Peak, Valley, and Cliff. The PVC test is a method to identify these distinct local patterns (called excursions) in time series data using a compound contrast approach within the limma framework.

In this context:

  • A peak occurs when a timepoint is significantly higher than both its immediate neighbors.
  • A valley occurs when a timepoint is significantly lower than both its neighbors.
  • A cliff describes a sharp directional shift: when one neighbor is similar in value, but the other is significantly different (higher or lower).

Method

The PVC analysis is implemented as a rolling, compound contrast applied across the time series. Specifically, a window of three consecutive timepoints is evaluated at a time:
(Tᵢ₋₁, Tᵢ, Tᵢ₊₁)
where Tᵢ is the center of the window.

The first and last timepoints are excluded from this analysis, since they lack both left and right neighbors.

For each triplet, a contrast is defined as:

2*Tᵢ - Tᵢ₋₁ - Tᵢ₊₁

This contrast measures whether the center point Tᵢ deviates from both its neighbors in the same direction:

  • If Tᵢ is greater than both neighbors → large positive contrast → indicates a peak
  • If Tᵢ is lower than both neighbors → large negative contrast → indicates a valley
  • If Tᵢ lies between its neighbors → effects cancel → contrast is near zero → no signal

This contrast is tested using limma, and the result is a p-value that indicates whether the observed pattern is statistically significant for each feature (e.g., gene, phosphosite).

Multiple Testing Correction

Because the test is performed independently at each center timepoint across all features, multiple testing correction is applied at the timepoint level. This ensures that significance is not inflated due to repeated testing across features.

Importantly, PVC does not assess whether an entire feature contains a full peak/valley/cliff pattern over time — it evaluates each timepoint independently for such local patterns.

Details about the dataset

The example dataset used here involves a time-series phosphoproteomics experiment, where CHO (chinese hamster ovary) cells were cultivated in three bioreactors (three biological replicates). The experiment includes the following setup:

  • Samples were taken both during the exponential and stationary growth phases.
  • Samples were collected in triplicates from each reactor at defined timepoints relative to cell feeding:
    • 60 minutes before feeding
    • 15, 60, 90, 120, and 240 minutes after feeding

Note that the dataset was truncated to 1000 rows for file size reasons and the annotation info, such as gene name, was randomly shuffled.

Load the packages

library(SplineOmics)
library(here) # For managing filepaths
#> here() starts at /home/thomas/Documents/PhD/projects/DGTX/SplineOmics_hub/SplineOmics
library(dplyr) # For data manipulation
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(knitr) # For Showing the head of the data and the meta tables.

Load the files

In this example, the proteomics_data.rds file contains the numeric values (the intensities) and also the feature descriptions, such as gene and protein name (= annotation part). Usually, you would load the data from for example an Excel file, but the .rds file is more compressed, which is the reason this format was chosen here to limit the size of the SplineOmics package.

The file meta.xlsx contains the meta information, which are the descriptions of the columns of the numeric values of data.

(These example files are part of the package and don’t have to be present on your system).

Please note that this dataset is an actual experimental dataset, but the annotation information, such as gene names, has been removed since it was not yet published at the time of making the SplineOmics package public. Instead, the dataset includes randomly generated gene symbols and gene names corresponding to Cricetulus griseus (Chinese Hamster) for each row. This is intended to demonstrate the functionality of the package.

The left part of data contains the numeric values, and the right part the annotation info, which can be copied in a separate dataframe, as shown below.

data <- readRDS(xzfile(system.file(
  "extdata",
  "phosphoproteomics_data.rds.xz",
  package = "SplineOmics"
)))

meta <- read.csv(
  system.file(
    "extdata",
    "phosphoproteomics_meta.csv",
    package = "SplineOmics"
  ),
  stringsAsFactors = FALSE
)

# Extract the annotation part from the dataframe.
first_na_col <- which(is.na(data[1, ]))[1]
annotation <- data |> dplyr::select((first_na_col + 1):ncol(data)) 

Show top rows of data

kable(
  head(data),
  format = "markdown"
  )
TP01_10 MaxLFQ Intensity TP01_12 MaxLFQ Intensity TP01_9 MaxLFQ Intensity TP02_10 MaxLFQ Intensity TP02_12 MaxLFQ Intensity TP02_9 MaxLFQ Intensity TP03_10 MaxLFQ Intensity TP03_12 MaxLFQ Intensity TP03_9 MaxLFQ Intensity TP04_10 MaxLFQ Intensity TP04_12 MaxLFQ Intensity TP04_9 MaxLFQ Intensity TP05_10 MaxLFQ Intensity TP05_12 MaxLFQ Intensity TP05_9 MaxLFQ Intensity TP06_10 MaxLFQ Intensity TP06_12 MaxLFQ Intensity TP06_9 MaxLFQ Intensity TP07_10 MaxLFQ Intensity TP07_12 MaxLFQ Intensity TP07_9 MaxLFQ Intensity TP08_10 MaxLFQ Intensity TP08_12 MaxLFQ Intensity TP08_9 MaxLFQ Intensity TP09_10 MaxLFQ Intensity TP09_12 MaxLFQ Intensity TP09_9 MaxLFQ Intensity TP10_10 MaxLFQ Intensity TP10_12 MaxLFQ Intensity TP10_9 MaxLFQ Intensity TP11_10 MaxLFQ Intensity TP11_12 MaxLFQ Intensity TP11_9 MaxLFQ Intensity TP12_10 MaxLFQ Intensity TP12_12 MaxLFQ Intensity TP12_9 MaxLFQ Intensity …37 N..Best.Localization.Probability T..Protein T..Index T..Gene T..Protein.ID T..Peptide
18.8809 18.7333 18.736 18.6562 18.767 18.7793 18.6554 18.7401 18.5699 18.5877 18.7569 18.6083 18.7291 18.7104 18.5587 18.7192 18.7356 18.5746 18.7935 18.6219 18.2898 18.324 18.8198 18.4836 16.39 16.5426 18.3209 18.395 19.0248 18.4475 18.4724 18.7529 18.3047 18.5518 18.6099 18.3153 NA 0.9186 tr|A0A3L7HQ14|A0A3L7HQ14_CRIGR A0A3L7HUN2_S117 Top2b A0A3L7IN31 NsPMAQTPPCHLPNIPGVTSPGTLIEDSK;NsPMAQtPPCHLPNIPGVTSPGTLIEDSK;RNsPMAQTPPCHLPNIPGVTSPGTLIEDSK;RNsPMAQTPPCHLPNIPGVTSPGtLIEDSK;RNsPMAQtPPCHLPNIPGVTSPGTLIEDSK
15.3527 NaN NaN NaN NaN NaN 15.7263 NaN NaN 15.4816 15.5787 15.598 NaN 14.9219 NaN 15.5459 NaN NaN NaN NaN NaN NaN NaN NaN 15.6507 15.3509 NaN NaN NaN NaN NaN 15.3698 NaN NaN NaN NaN NA 1.0000 tr|A0A3L7HYQ7|A0A3L7HYQ7_CRIGR A0A061I228_S641 NA A0A098KXG8 NFQDsPK
16.3484 NaN NaN NaN NaN 16.5751 16.6528 NaN 16.0512 NaN NaN NaN NaN NaN NaN NaN NaN 16.4891 16.9102 17.0032 16.7853 NaN NaN NaN 17.1439 17.163 16.2732 16.5507 17.2955 16.9251 16.5331 17.6113 NaN 16.3595 NaN 17.1642 NA 1.0000 tr|A0A061IGN9|A0A061IGN9_CRIGR A0A3L7GUZ0_S522 Snip1 A0A3L7HFU7 QHLENDPGsNEDTDIPK
16.2122 15.7589 15.8026 15.9001 15.4823 15.7732 16.452 15.884 15.8153 16.1033 16.2624 16.5246 15.773 15.9233 16.1753 15.6983 15.5666 16.251 15.7702 16.2542 16.307 15.5049 15.7808 16.5767 15.6377 16.0753 15.7783 15.6542 15.9858 16.0209 15.6547 16.1474 16.4154 16.1201 16.0911 16.1712 NA 0.9601 tr|A0A061HXV6|A0A061HXV6_CRIGR A0A3L7HQF8_S50 Tomm34 A0A3L7HPC6 ASFITDEEQHAsPR;ASFITDEEQHAsPRPAPQLSR
17.0699 NaN 17.3155 NaN 16.9908 16.9393 17.6292 NaN 16.9251 17.3094 17.4428 17.6153 17.301 NaN NaN 17.2152 16.9523 NaN 16.1536 NaN 16.2584 15.6612 16.263 16.5353 NaN NaN 16.0288 NaN NaN 16.3069 NaN NaN NaN 16.3798 16.2436 NaN NA 0.9561 tr|A0A8C2L7B4|A0A8C2L7B4_CRIGR A0A061IGN9_S52 Pus3 A0A098KXG8 sRSPLHPTNEVPR;sRsPLHPTNEVPR
15.0928 NaN 15.3313 NaN NaN 15.3163 NaN 15.3458 15.8878 NaN NaN NaN NaN NaN 15.9505 15.6368 NaN 15.683 15.467 15.0704 15.6614 15.2963 15.7213 15.7003 15.0605 15.4058 16.3211 15.7134 NaN 15.9043 16.0438 15.7761 15.8954 15.5926 15.5007 16.1923 NA 0.9562 tr|A0A061HY55|A0A061HY55_CRIGR A0A061HV22_S13 Ints1 A0A8C2L7B4 GGYLQGNVsGR

Show top rows of meta

kable(
  head(meta),
  format = "markdown"
  )
Sample.ID Corrected.Sample.ID Reactor Time.Point Phase Time
TP01_9 MaxLFQ Intensity 1 E09 TP01 Exponential -60
TP01_10 MaxLFQ Intensity 2 E10 TP01 Exponential -60
TP01_12 MaxLFQ Intensity 3 E12 TP01 Exponential -60
TP02_9 MaxLFQ Intensity 4 E09 TP02 Exponential 15
TP02_10 MaxLFQ Intensity 5 E10 TP02 Exponential 15
TP02_12 MaxLFQ Intensity 6 E12 TP02 Exponential 15

Show top rows of annotation

kable(
  head(annotation),
  format = "markdown"
  )
N..Best.Localization.Probability T..Protein T..Index T..Gene T..Protein.ID T..Peptide
0.9186 tr|A0A3L7HQ14|A0A3L7HQ14_CRIGR A0A3L7HUN2_S117 Top2b A0A3L7IN31 NsPMAQTPPCHLPNIPGVTSPGTLIEDSK;NsPMAQtPPCHLPNIPGVTSPGTLIEDSK;RNsPMAQTPPCHLPNIPGVTSPGTLIEDSK;RNsPMAQTPPCHLPNIPGVTSPGtLIEDSK;RNsPMAQtPPCHLPNIPGVTSPGTLIEDSK
1.0000 tr|A0A3L7HYQ7|A0A3L7HYQ7_CRIGR A0A061I228_S641 NA A0A098KXG8 NFQDsPK
1.0000 tr|A0A061IGN9|A0A061IGN9_CRIGR A0A3L7GUZ0_S522 Snip1 A0A3L7HFU7 QHLENDPGsNEDTDIPK
0.9601 tr|A0A061HXV6|A0A061HXV6_CRIGR A0A3L7HQF8_S50 Tomm34 A0A3L7HPC6 ASFITDEEQHAsPR;ASFITDEEQHAsPRPAPQLSR
0.9561 tr|A0A8C2L7B4|A0A8C2L7B4_CRIGR A0A061IGN9_S52 Pus3 A0A098KXG8 sRSPLHPTNEVPR;sRsPLHPTNEVPR
0.9562 tr|A0A061HY55|A0A061HY55_CRIGR A0A061HV22_S13 Ints1 A0A8C2L7B4 GGYLQGNVsGR

Bring the Inputs into the Standardized Format

data <- SplineOmics::extract_data(
  # The dataframe with the numbers on the left and info on the right.
  data = data,
  # Use this annotation column for the feature names.
  feature_name_columns = c("T..Gene"),
  use_row_index = TRUE,   # makes the feature names unique with row index
  top_row = 1,
  bottom_row = 1000,
  right_col = 36,
  left_col = 1
)

SplineOmics Object

# Those fields are mandatory, because we believe that when such a report is
# opened after half a year, those infos can be very helpful.
report_info <- list(
  omics_data_type = "PPTX",
  data_description = "Phosphoproteomics data of CHO cells",
  data_collection_date = "February 2024",
  analyst_name = "Thomas Rauter",
  contact_info = "thomas.rauter@plus.ac.at",
  project_name = "DGTX"
)
# splineomics now contains the SplineOmics object.
splineomics <- SplineOmics::create_splineomics(
  data = data,
  meta = meta,
  annotation = annotation,
  report_info = report_info,
  condition = "Phase", # Column of meta that contains the levels.
  meta_batch_column = "Reactor", # For batch effect removal
)

Run the PVC-analysis

plot_info = list(
  y_axis_label = "log2 intensity",
  time_unit = "min",
  treatment_labels = list("feeding"),
  treatment_timepoints = list(0)
)
# Check out the documentation of the function under the Reference tab.
excursion_plots <- SplineOmics::find_pvc(
  splineomics = splineomics,
  alphas = 0.025,
  plot_info = plot_info,
  report_dir = here::here("results", "pvc_report")
)
#> Warning: The data contains missing values (NA). Ensure that imputation or handling of missing values aligns with your analysis requirements. Note that limma can  handle missing values (it just ignores them), and therefore  also SplineOmics can handle them.
#> Column 'Reactor' of meta will be used to remove the batch effect for the plotting
#> Warning: Partial NA coefficients for 200 probe(s)
#> 
#> Detected 1 total pattern hits for condition level: Exponential
#> 
#> Summary by pattern type:
#> p: 0, v: 1, b: 0, t: 0
#> 
#> Breakdown by timepoint:
#> -60: p=0; v=0; b=0; t=0
#> 15: p=0; v=0; b=0; t=0
#> 60: p=0; v=0; b=0; t=0
#> 90: p=0; v=1; b=0; t=0
#> 120: p=0; v=0; b=0; t=0
#> 240: p=0; v=0; b=0; t=0
#> Warning: Partial NA coefficients for 205 probe(s)
#> 
#> Detected 10 total pattern hits for condition level: Stationary
#> 
#> Summary by pattern type:
#> p: 8, v: 2, b: 0, t: 0
#> 
#> Breakdown by timepoint:
#> -60: p=0; v=0; b=0; t=0
#> 15: p=0; v=0; b=0; t=0
#> 60: p=8; v=2; b=0; t=0
#> 90: p=0; v=0; b=0; t=0
#> 120: p=0; v=0; b=0; t=0
#> 240: p=0; v=0; b=0; t=0
#> Info PVC report generation completed successfully.
#>  Your HTML reports are located in the directory:  /home/thomas/Documents/PhD/projects/DGTX/SplineOmics_hub/SplineOmics/results/pvc_report .
#>  Please note that due to embedded files, the reports might be flagged as
#>  harmful by other software. Rest assured that they provide no harm.

The terminal output of the function gives information about the total amount of excursions found in each level, and also a breakdown by timepoint and type of excursion (p = peak, v = valley, b = bottom of cliff, t = top of cliff, always for the respective timepoint that is mentioned in that row). Here you can see the resulting HTML report.

Session Info

#> R version 4.3.3 (2024-02-29)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 22.04.5 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/local/R-4.3.3/lib/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=de_AT.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=de_AT.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=de_AT.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=de_AT.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Europe/Vienna
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices datasets  utils     methods   base     
#> 
#> other attached packages:
#> [1] knitr_1.50        dplyr_1.1.4       here_1.0.1        SplineOmics_0.2.0
#> 
#> loaded via a namespace (and not attached):
#>   [1] Rdpack_2.6.4             bitops_1.0-9             gridExtra_2.3           
#>   [4] rlang_1.1.6              magrittr_2.0.3           clue_0.3-66             
#>   [7] GetoptLong_1.0.5         matrixStats_1.5.0        compiler_4.3.3          
#>  [10] reshape2_1.4.4           png_0.1-8                systemfonts_1.2.3       
#>  [13] vctrs_0.6.5              stringr_1.5.1            pkgconfig_2.0.3         
#>  [16] shape_1.4.6.1            crayon_1.5.3             fastmap_1.2.0           
#>  [19] backports_1.5.0          labeling_0.4.3           caTools_1.18.3          
#>  [22] rmarkdown_2.29           nloptr_2.2.1             ragg_1.4.0              
#>  [25] purrr_1.0.4              xfun_0.52                cachem_1.1.0            
#>  [28] jsonlite_2.0.0           progress_1.2.3           EnvStats_3.1.0          
#>  [31] remaCor_0.0.18           BiocParallel_1.36.0      broom_1.0.8             
#>  [34] parallel_4.3.3           prettyunits_1.2.0        cluster_2.1.8.1         
#>  [37] R6_2.6.1                 bslib_0.9.0              stringi_1.8.7           
#>  [40] RColorBrewer_1.1-3       limma_3.58.1             boot_1.3-31             
#>  [43] car_3.1-3                numDeriv_2016.8-1.1      jquerylib_0.1.4         
#>  [46] Rcpp_1.0.14              iterators_1.0.14         base64enc_0.1-3         
#>  [49] IRanges_2.36.0           Matrix_1.6-5             splines_4.3.3           
#>  [52] tidyselect_1.2.1         rstudioapi_0.17.1        abind_1.4-8             
#>  [55] yaml_2.3.10              viridis_0.6.5            doParallel_1.0.17       
#>  [58] gplots_3.2.0             codetools_0.2-20         plyr_1.8.9              
#>  [61] lmerTest_3.1-3           lattice_0.22-7           tibble_3.2.1            
#>  [64] withr_3.0.2              Biobase_2.62.0           evaluate_1.0.3          
#>  [67] desc_1.4.3               zip_2.3.3                circlize_0.4.16         
#>  [70] pillar_1.10.2            BiocManager_1.30.25      carData_3.0-5           
#>  [73] KernSmooth_2.23-26       renv_1.1.4               foreach_1.5.2           
#>  [76] stats4_4.3.3             reformulas_0.4.1         generics_0.1.4          
#>  [79] rprojroot_2.0.4          S4Vectors_0.40.2         hms_1.1.3               
#>  [82] ggplot2_3.5.2            scales_1.4.0             aod_1.3.3               
#>  [85] minqa_1.2.8              gtools_3.9.5             RhpcBLASctl_0.23-42     
#>  [88] glue_1.8.0               pheatmap_1.0.12          tools_4.3.3             
#>  [91] fANCOVA_0.6-1            dendextend_1.19.0        variancePartition_1.32.5
#>  [94] lme4_1.1-37              openxlsx_4.2.8           mvtnorm_1.3-3           
#>  [97] fs_1.6.6                 grid_4.3.3               tidyr_1.3.1             
#> [100] rbibutils_2.3            colorspace_2.1-1         nlme_3.1-168            
#> [103] patchwork_1.3.0          Formula_1.2-5            cli_3.6.5               
#> [106] textshaping_1.0.1        viridisLite_0.4.2        svglite_2.2.1           
#> [109] ComplexHeatmap_2.18.0    corpcor_1.6.10           gtable_0.3.6            
#> [112] sass_0.4.10              digest_0.6.37            BiocGenerics_0.48.1     
#> [115] pbkrtest_0.5.4           ggrepel_0.9.6            rjson_0.2.23            
#> [118] htmlwidgets_1.6.4        farver_2.1.2             htmltools_0.5.8.1       
#> [121] pkgdown_2.1.3            lifecycle_1.0.4          GlobalOptions_0.1.2     
#> [124] statmod_1.5.0            MASS_7.3-60.0.1