PVC_analysis

About this tutorial

This tutorial demonstrates the excursion-analysis capabilities of the SplineOmics package by walking through a complete, real-world phosphoproteomics example from start to finish.

In the context of time series data, an excursion refers to a transient deviation from a baseline — a concept borrowed from signal processing. These short-term, local shifts can reveal biologically meaningful events that are not captured by global trends.

Details of how excursions are detected are described in the section on the PVC test, a method specifically designed to identify such local patterns.

While the core strength of SplineOmics lies in modeling global temporal patterns using smoothing splines, many biologically important phenomena present as localized events. These may not be well captured by standard spline fits due to their smooth and global nature.

For example, in phosphoproteomics data, you might observe a sharp, temporary phosphorylation spike at a specific timepoint following a stimulus — a pattern that returns to baseline shortly afterward. Such a transient peak can indicate a key signaling activation event that is functionally critical, but would be smoothed out or diluted in traditional spline-based analyses.

To handle these scenarios, we developed the PVC test (described below), which is specifically designed to detect such sharp, local excursions in time-resolved omics data.

Note 1

The documentation of all the SplineOmics package functions can be viewed here

Note 2

This vignette focuses on the excursion-analysis. The general functionalities of SplineOmics are explained in the get-started vignette.

The PVC Test

PVC stands for Peak, Valley, and Cliff. The PVC test is a method to identify these distinct local patterns (called excursions) in time series data using a compound contrast approach within the limma framework.

In this context:

A peak occurs when a timepoint is significantly higher than both its immediate neighbors.
A valley occurs when a timepoint is significantly lower than both its neighbors.
A cliff describes a sharp directional shift: when one neighbor is similar in value, but the other is significantly different (higher or lower).

Method

The PVC analysis is implemented as a rolling, compound contrast applied across the time series. Specifically, a window of three consecutive timepoints is evaluated at a time:
(Tᵢ₋₁, Tᵢ, Tᵢ₊₁)
where Tᵢ is the center of the window.

The first and last timepoints are excluded from this analysis, since they lack both left and right neighbors.

For each triplet, a contrast is defined as:

2*Tᵢ - Tᵢ₋₁ - Tᵢ₊₁

This contrast measures whether the center point Tᵢ deviates from both its neighbors in the same direction:

If Tᵢ is greater than both neighbors → large positive contrast → indicates a peak
If Tᵢ is lower than both neighbors → large negative contrast → indicates a valley
If Tᵢ lies between its neighbors → effects cancel → contrast is near zero → no signal

This contrast is tested using limma, and the result is a p-value that indicates whether the observed pattern is statistically significant for each feature (e.g., gene, phosphosite).

Multiple Testing Correction

Because the test is performed independently at each center timepoint across all features, multiple testing correction is applied at the timepoint level. This ensures that significance is not inflated due to repeated testing across features.

Importantly, PVC does not assess whether an entire feature contains a full peak/valley/cliff pattern over time — it evaluates each timepoint independently for such local patterns.

Details about the dataset

The example dataset used here involves a time-series phosphoproteomics experiment, where CHO (chinese hamster ovary) cells were cultivated in three bioreactors (three biological replicates). The experiment includes the following setup:

Samples were taken both during the exponential and stationary growth phases.
Samples were collected in triplicates from each reactor at defined timepoints relative to cell feeding:
- 60 minutes before feeding
- 15, 60, 90, 120, and 240 minutes after feeding

Note that the dataset was truncated to 1000 rows for file size reasons and the annotation info, such as gene name, was randomly shuffled.

Load the packages

library(SplineOmics)
library(here) # For managing filepaths
#> here() starts at /home/thomas/Documents/PhD/projects/DGTX/SplineOmics_hub/SplineOmics
library(dplyr) # For data manipulation
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(knitr) # For Showing the head of the data and the meta tables.

Load the files

In this example, the proteomics_data.rds file contains the numeric values (the intensities) and also the feature descriptions, such as gene and protein name (= annotation part). Usually, you would load the data from for example an Excel file, but the .rds file is more compressed, which is the reason this format was chosen here to limit the size of the SplineOmics package.

The file meta.xlsx contains the meta information, which are the descriptions of the columns of the numeric values of data.

(These example files are part of the package and don’t have to be present on your system).

Please note that this dataset is an actual experimental dataset, but the annotation information, such as gene names, has been removed since it was not yet published at the time of making the SplineOmics package public. Instead, the dataset includes randomly generated gene symbols and gene names corresponding to Cricetulus griseus (Chinese Hamster) for each row. This is intended to demonstrate the functionality of the package.

The left part of data contains the numeric values, and the right part the annotation info, which can be copied in a separate dataframe, as shown below.

data <- readRDS(xzfile(system.file(
  "extdata",
  "phosphoproteomics_data.rds.xz",
  package = "SplineOmics"
)))

meta <- read.csv(
  system.file(
    "extdata",
    "phosphoproteomics_meta.csv",
    package = "SplineOmics"
  ),
  stringsAsFactors = FALSE
)

# Extract the annotation part from the dataframe.
first_na_col <- which(is.na(data[1, ]))[1]
annotation <- data |> dplyr::select((first_na_col + 1):ncol(data))

Show top rows of data

kable(
  head(data),
  format = "markdown"
  )

TP01_10 MaxLFQ Intensity	TP01_12 MaxLFQ Intensity	TP01_9 MaxLFQ Intensity	TP02_10 MaxLFQ Intensity	TP02_12 MaxLFQ Intensity	TP02_9 MaxLFQ Intensity	TP03_10 MaxLFQ Intensity	TP03_12 MaxLFQ Intensity	TP03_9 MaxLFQ Intensity	TP04_10 MaxLFQ Intensity	TP04_12 MaxLFQ Intensity	TP04_9 MaxLFQ Intensity	TP05_10 MaxLFQ Intensity	TP05_12 MaxLFQ Intensity	TP05_9 MaxLFQ Intensity	TP06_10 MaxLFQ Intensity	TP06_12 MaxLFQ Intensity	TP06_9 MaxLFQ Intensity	TP07_10 MaxLFQ Intensity	TP07_12 MaxLFQ Intensity	TP07_9 MaxLFQ Intensity	TP08_10 MaxLFQ Intensity	TP08_12 MaxLFQ Intensity	TP08_9 MaxLFQ Intensity	TP09_10 MaxLFQ Intensity	TP09_12 MaxLFQ Intensity	TP09_9 MaxLFQ Intensity	TP10_10 MaxLFQ Intensity	TP10_12 MaxLFQ Intensity	TP10_9 MaxLFQ Intensity	TP11_10 MaxLFQ Intensity	TP11_12 MaxLFQ Intensity	TP11_9 MaxLFQ Intensity	TP12_10 MaxLFQ Intensity	TP12_12 MaxLFQ Intensity	TP12_9 MaxLFQ Intensity	…37	N..Best.Localization.Probability	T..Protein	T..Index	T..Gene	T..Protein.ID	T..Peptide
18.8809	18.7333	18.736	18.6562	18.767	18.7793	18.6554	18.7401	18.5699	18.5877	18.7569	18.6083	18.7291	18.7104	18.5587	18.7192	18.7356	18.5746	18.7935	18.6219	18.2898	18.324	18.8198	18.4836	16.39	16.5426	18.3209	18.395	19.0248	18.4475	18.4724	18.7529	18.3047	18.5518	18.6099	18.3153	NA	0.9186	tr\|A0A3L7HQ14\|A0A3L7HQ14_CRIGR	A0A3L7HUN2_S117	Top2b	A0A3L7IN31	NsPMAQTPPCHLPNIPGVTSPGTLIEDSK;NsPMAQtPPCHLPNIPGVTSPGTLIEDSK;RNsPMAQTPPCHLPNIPGVTSPGTLIEDSK;RNsPMAQTPPCHLPNIPGVTSPGtLIEDSK;RNsPMAQtPPCHLPNIPGVTSPGTLIEDSK
15.3527	NaN	NaN	NaN	NaN	NaN	15.7263	NaN	NaN	15.4816	15.5787	15.598	NaN	14.9219	NaN	15.5459	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	15.6507	15.3509	NaN	NaN	NaN	NaN	NaN	15.3698	NaN	NaN	NaN	NaN	NA	1.0000	tr\|A0A3L7HYQ7\|A0A3L7HYQ7_CRIGR	A0A061I228_S641	NA	A0A098KXG8	NFQDsPK
16.3484	NaN	NaN	NaN	NaN	16.5751	16.6528	NaN	16.0512	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	16.4891	16.9102	17.0032	16.7853	NaN	NaN	NaN	17.1439	17.163	16.2732	16.5507	17.2955	16.9251	16.5331	17.6113	NaN	16.3595	NaN	17.1642	NA	1.0000	tr\|A0A061IGN9\|A0A061IGN9_CRIGR	A0A3L7GUZ0_S522	Snip1	A0A3L7HFU7	QHLENDPGsNEDTDIPK
16.2122	15.7589	15.8026	15.9001	15.4823	15.7732	16.452	15.884	15.8153	16.1033	16.2624	16.5246	15.773	15.9233	16.1753	15.6983	15.5666	16.251	15.7702	16.2542	16.307	15.5049	15.7808	16.5767	15.6377	16.0753	15.7783	15.6542	15.9858	16.0209	15.6547	16.1474	16.4154	16.1201	16.0911	16.1712	NA	0.9601	tr\|A0A061HXV6\|A0A061HXV6_CRIGR	A0A3L7HQF8_S50	Tomm34	A0A3L7HPC6	ASFITDEEQHAsPR;ASFITDEEQHAsPRPAPQLSR
17.0699	NaN	17.3155	NaN	16.9908	16.9393	17.6292	NaN	16.9251	17.3094	17.4428	17.6153	17.301	NaN	NaN	17.2152	16.9523	NaN	16.1536	NaN	16.2584	15.6612	16.263	16.5353	NaN	NaN	16.0288	NaN	NaN	16.3069	NaN	NaN	NaN	16.3798	16.2436	NaN	NA	0.9561	tr\|A0A8C2L7B4\|A0A8C2L7B4_CRIGR	A0A061IGN9_S52	Pus3	A0A098KXG8	sRSPLHPTNEVPR;sRsPLHPTNEVPR
15.0928	NaN	15.3313	NaN	NaN	15.3163	NaN	15.3458	15.8878	NaN	NaN	NaN	NaN	NaN	15.9505	15.6368	NaN	15.683	15.467	15.0704	15.6614	15.2963	15.7213	15.7003	15.0605	15.4058	16.3211	15.7134	NaN	15.9043	16.0438	15.7761	15.8954	15.5926	15.5007	16.1923	NA	0.9562	tr\|A0A061HY55\|A0A061HY55_CRIGR	A0A061HV22_S13	Ints1	A0A8C2L7B4	GGYLQGNVsGR

Show top rows of meta

kable(
  head(meta),
  format = "markdown"
  )

Sample.ID	Corrected.Sample.ID	Reactor	Time.Point	Phase	Time
TP01_9 MaxLFQ Intensity	1	E09	TP01	Exponential	-60
TP01_10 MaxLFQ Intensity	2	E10	TP01	Exponential	-60
TP01_12 MaxLFQ Intensity	3	E12	TP01	Exponential	-60
TP02_9 MaxLFQ Intensity	4	E09	TP02	Exponential	15
TP02_10 MaxLFQ Intensity	5	E10	TP02	Exponential	15
TP02_12 MaxLFQ Intensity	6	E12	TP02	Exponential	15

Show top rows of annotation

kable(
  head(annotation),
  format = "markdown"
  )

N..Best.Localization.Probability	T..Protein	T..Index	T..Gene	T..Protein.ID	T..Peptide
0.9186	tr\|A0A3L7HQ14\|A0A3L7HQ14_CRIGR	A0A3L7HUN2_S117	Top2b	A0A3L7IN31	NsPMAQTPPCHLPNIPGVTSPGTLIEDSK;NsPMAQtPPCHLPNIPGVTSPGTLIEDSK;RNsPMAQTPPCHLPNIPGVTSPGTLIEDSK;RNsPMAQTPPCHLPNIPGVTSPGtLIEDSK;RNsPMAQtPPCHLPNIPGVTSPGTLIEDSK
1.0000	tr\|A0A3L7HYQ7\|A0A3L7HYQ7_CRIGR	A0A061I228_S641	NA	A0A098KXG8	NFQDsPK
1.0000	tr\|A0A061IGN9\|A0A061IGN9_CRIGR	A0A3L7GUZ0_S522	Snip1	A0A3L7HFU7	QHLENDPGsNEDTDIPK
0.9601	tr\|A0A061HXV6\|A0A061HXV6_CRIGR	A0A3L7HQF8_S50	Tomm34	A0A3L7HPC6	ASFITDEEQHAsPR;ASFITDEEQHAsPRPAPQLSR
0.9561	tr\|A0A8C2L7B4\|A0A8C2L7B4_CRIGR	A0A061IGN9_S52	Pus3	A0A098KXG8	sRSPLHPTNEVPR;sRsPLHPTNEVPR
0.9562	tr\|A0A061HY55\|A0A061HY55_CRIGR	A0A061HV22_S13	Ints1	A0A8C2L7B4	GGYLQGNVsGR

Bring the Inputs into the Standardized Format

data <- SplineOmics::extract_data(
  # The dataframe with the numbers on the left and info on the right.
  data = data,
  # Use this annotation column for the feature names.
  feature_name_columns = c("T..Gene"),
  use_row_index = TRUE,   # makes the feature names unique with row index
  top_row = 1,
  bottom_row = 1000,
  right_col = 36,
  left_col = 1
)

SplineOmics Object

# Those fields are mandatory, because we believe that when such a report is
# opened after half a year, those infos can be very helpful.
report_info <- list(
  omics_data_type = "PPTX",
  data_description = "Phosphoproteomics data of CHO cells",
  data_collection_date = "February 2024",
  analyst_name = "Thomas Rauter",
  contact_info = "thomas.rauter@plus.ac.at",
  project_name = "DGTX"
)

# splineomics now contains the SplineOmics object.
splineomics <- SplineOmics::create_splineomics(
  data = data,
  meta = meta,
  annotation = annotation,
  report_info = report_info,
  condition = "Phase", # Column of meta that contains the levels.
  meta_batch_column = "Reactor", # For batch effect removal
)

Run the PVC-analysis

plot_info <- list(
  y_axis_label = "log2 intensity",
  time_unit = "min",
  treatment_labels = list("feeding"),
  treatment_timepoints = list(0)
)

# Check out the documentation of the function under the Reference tab.
excursion_plots <- SplineOmics::find_pvc(
  splineomics = splineomics,
  alphas = 0.025,
  plot_info = plot_info,
  report_dir = withr::local_tempdir()
)
#> Warning: The data contains missing values (NA). Ensure that imputation or handling of missing values aligns with your analysis requirements. Note that limma can  handle missing values (it just ignores them), and therefore  also SplineOmics can handle them.
#> Warning: Partial NA coefficients for 200 probe(s)
#> 
#> Detected 1 total pattern hits for condition level: Exponential
#> 
#> Summary by pattern type:
#> p: 0, v: 1, b: 0, t: 0
#> 
#> Breakdown by timepoint:
#> -60: p=0; v=0; b=0; t=0
#> 15: p=0; v=0; b=0; t=0
#> 60: p=0; v=0; b=0; t=0
#> 90: p=0; v=1; b=0; t=0
#> 120: p=0; v=0; b=0; t=0
#> 240: p=0; v=0; b=0; t=0
#> Warning: Partial NA coefficients for 205 probe(s)
#> 
#> Detected 12 total pattern hits for condition level: Stationary
#> 
#> Summary by pattern type:
#> p: 8, v: 3, b: 1, t: 0
#> 
#> Breakdown by timepoint:
#> -60: p=0; v=0; b=0; t=0
#> 15: p=0; v=1; b=1; t=0
#> 60: p=8; v=2; b=0; t=0
#> 90: p=0; v=0; b=0; t=0
#> 120: p=0; v=0; b=0; t=0
#> 240: p=0; v=0; b=0; t=0
#> 
#>  Info PVC report generation completed successfully.
#>  Your HTML reports are located in the directory:  /tmp/RtmpBGqV42/filea9153f58d793 .
#>  Please note that due to embedded files, the reports might be flagged as
#>  harmful by other software. Rest assured that they provide no harm.

The terminal output of the function gives information about the total amount of excursions found in each level, and also a breakdown by timepoint and type of excursion (p = peak, v = valley, b = bottom of cliff, t = top of cliff, always for the respective timepoint that is mentioned in that row). Here you can see the resulting HTML report.

Session Info

#> R version 4.5.1 (2025-06-13)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.5 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0  LAPACK version 3.10.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=de_AT.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=de_AT.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=de_AT.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=de_AT.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Europe/Vienna
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices datasets  utils     methods   base     
#> 
#> other attached packages:
#> [1] knitr_1.50        dplyr_1.1.4       here_1.0.1        SplineOmics_0.3.1
#> 
#> loaded via a namespace (and not attached):
#>   [1] Rdpack_2.6.4             bitops_1.0-9             pbapply_1.7-2           
#>   [4] writexl_1.5.4            rlang_1.1.6              magrittr_2.0.3          
#>   [7] clue_0.3-66              GetoptLong_1.0.5         matrixStats_1.5.0       
#>  [10] compiler_4.5.1           reshape2_1.4.4           png_0.1-8               
#>  [13] systemfonts_1.2.3        vctrs_0.6.5              stringr_1.5.1           
#>  [16] pkgconfig_2.0.3          shape_1.4.6.1            crayon_1.5.3            
#>  [19] fastmap_1.2.0            backports_1.5.0          labeling_0.4.3          
#>  [22] caTools_1.18.3           rmarkdown_2.29           nloptr_2.2.1            
#>  [25] ragg_1.4.0               purrr_1.1.0              xfun_0.52               
#>  [28] cachem_1.1.0             jsonlite_2.0.0           progress_1.2.3          
#>  [31] EnvStats_3.1.0           remaCor_0.0.18           gmp_0.7-5               
#>  [34] BiocParallel_1.42.1      broom_1.0.8              parallel_4.5.1          
#>  [37] prettyunits_1.2.0        cluster_2.1.8.1          R6_2.6.1                
#>  [40] stringi_1.8.7            bslib_0.9.0              RColorBrewer_1.1-3      
#>  [43] limma_3.64.1             car_3.1-3                boot_1.3-31             
#>  [46] ClusterR_1.3.3           numDeriv_2016.8-1.1      jquerylib_0.1.4         
#>  [49] Rcpp_1.1.0               iterators_1.0.14         base64enc_0.1-3         
#>  [52] IRanges_2.42.0           Matrix_1.7-3             splines_4.5.1           
#>  [55] tidyselect_1.2.1         rstudioapi_0.17.1        abind_1.4-8             
#>  [58] yaml_2.3.10              doParallel_1.0.17        gplots_3.2.0            
#>  [61] codetools_0.2-19         plyr_1.8.9               lmerTest_3.1-3          
#>  [64] lattice_0.22-5           tibble_3.3.0             withr_3.0.2             
#>  [67] Biobase_2.68.0           evaluate_1.0.4           desc_1.4.3              
#>  [70] zip_2.3.3                circlize_0.4.16          pillar_1.11.0           
#>  [73] BiocManager_1.30.26      carData_3.0-5            KernSmooth_2.23-26      
#>  [76] renv_1.1.4               foreach_1.5.2            stats4_4.5.1            
#>  [79] reformulas_0.4.1         generics_0.1.4           rprojroot_2.1.0         
#>  [82] S4Vectors_0.46.0         hms_1.1.3                ggplot2_3.5.2           
#>  [85] scales_1.4.0             aod_1.3.3                minqa_1.2.8             
#>  [88] gtools_3.9.5             RhpcBLASctl_0.23-42      glue_1.8.0              
#>  [91] tools_4.5.1              fANCOVA_0.6-1            variancePartition_1.38.0
#>  [94] lme4_1.1-37              mvtnorm_1.3-3            fs_1.6.6                
#>  [97] grid_4.5.1               tidyr_1.3.1              rbibutils_2.3           
#> [100] colorspace_2.1-1         nlme_3.1-168             Formula_1.2-5           
#> [103] cli_3.6.5                textshaping_1.0.1        svglite_2.2.1           
#> [106] ComplexHeatmap_2.24.1    corpcor_1.6.10           gtable_0.3.6            
#> [109] sass_0.4.10              digest_0.6.37            BiocGenerics_0.54.0     
#> [112] pbkrtest_0.5.4           ggrepel_0.9.6            rjson_0.2.23            
#> [115] htmlwidgets_1.6.4        farver_2.1.2             htmltools_0.5.8.1       
#> [118] pkgdown_2.1.3            lifecycle_1.0.4          GlobalOptions_0.1.2     
#> [121] statmod_1.5.0            MASS_7.3-65

Thomas Rauter

23 April, 2025