Title: | Nonparametric Change Point Detection for Multivariate Time Series |
---|---|
Description: | Implements the nonparametric moving sum procedure for detecting changes in the joint characteristic function (NP-MOJO) for multiple change point detection in multivariate time series. See McGonigle, E. T., Cho, H. (2023) <doi:10.48550/arXiv.2305.07581> for description of the NP-MOJO methodology. |
Authors: | Euan T. McGonigle [aut, cre], Haeran Cho [aut] |
Maintainer: | Euan T. McGonigle <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.2.1 |
Built: | 2025-01-23 05:26:08 UTC |
Source: | https://github.com/euanmcgonigle/cptnonpar |
Merges change point estimators from different lagged values into a final set of overall change point estimators.
multilag.cpts.merge( x.c, eta.merge = 1, merge.type = c("sequential", "bottom-up")[1] )
x.c |
A |
eta.merge |
A positive numeric value for the minimal mutual distance of changes, relative to bandwidth, used to merge change point estimators across different lags. |
merge.type |
String indicating the method used to merge change point estimators from different lags. Possible choices are "sequential" and "bottom-up". |
See McGonigle and Cho (2023) for further details.
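As a rough illustration of what a minimal mutual distance relative to the bandwidth means, the base R sketch below groups estimators from different lags whenever consecutive locations are closer than eta.merge * G. The helper `cluster_cpts` and its inputs are hypothetical and are not part of the package; the actual merging rules are those described in McGonigle and Cho (2023).

```r
# Hypothetical helper (not a package function): group change point
# estimators whose consecutive gaps are smaller than eta.merge * G.
cluster_cpts <- function(cpts, lags, G, eta.merge = 1) {
  if (length(cpts) == 0) return(list())
  ord <- order(cpts)
  cpts <- cpts[ord]
  lags <- lags[ord]
  # A new cluster starts whenever the gap to the previous estimator
  # is at least eta.merge * G.
  cluster.id <- cumsum(c(TRUE, diff(cpts) >= eta.merge * G))
  split(data.frame(cp = cpts, lag = lags), cluster.id)
}

# Made-up estimators from lags 0 and 1 with bandwidth G = 83.
clusters <- cluster_cpts(cpts = c(100, 104, 298, 301),
                         lags = c(0, 1, 0, 1), G = 83)
length(clusters)
```

Each cluster would then be reduced to a single estimator according to the chosen merge.type; that step is left out here because its details are not given in this section.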
A list object which contains the following fields:
cpts |
A matrix with rows corresponding to final change point estimators, with estimated change point location and associated lag and p-value given in columns. |
cpt.clusters |
A |
McGonigle, E.T., Cho, H. (2023). Nonparametric data segmentation in multivariate time series via joint characteristic functions. arXiv preprint arXiv:2305.07581.
Messer M., Kirchner M., Schiemann J., Roeper J., Neininger R., Schneider G. (2014). A Multiple Filter Test for the Detection of Rate Changes in Renewal Processes with Varying Variance. The Annals of Applied Statistics, 8(4), 2027-2067.
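set.seed(1)
n <- 500
# AR(1) noise whose scale drops after time 300, plus a mean shift after time 100
noise <- c(rep(1, 300), rep(0.4, 200)) * stats::arima.sim(model = list(ar = 0.3), n = n)
signal <- c(rep(0, 100), rep(2, 400))
x <- signal + noise
# Detect changes separately at lags 0 and 1, then merge the estimators
x.c0 <- np.mojo(x, G = 83, lag = 0)
x.c1 <- np.mojo(x, G = 83, lag = 1)
x.c <- multilag.cpts.merge(list(x.c0, x.c1))
x.c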
For a given set of bandwidths and lagged values of the time series, performs multiscale nonparametric change point detection of a possibly multivariate time series.
multiscale.np.mojo(
  x,
  G,
  lags = c(0, 1),
  kernel.f = c("quad.exp", "gauss", "euclidean", "laplace", "sine")[1],
  kern.par = 1,
  data.driven.kern.par = TRUE,
  threshold = c("bootstrap", "manual")[1],
  threshold.val = NULL,
  alpha = 0.1,
  reps = 199,
  boot.dep = 1.5 * (nrow(as.matrix(x))^(1/3)),
  parallel = FALSE,
  boot.method = c("mean.subtract", "no.mean.subtract")[1],
  criterion = c("eta", "epsilon", "eta.and.epsilon")[3],
  eta = 0.4,
  epsilon = 0.02,
  use.mean = FALSE,
  eta.merge = 1,
  merge.type = c("sequential", "bottom-up")[1],
  eta.bottom.up = 0.8
)
x |
Input data (a |
G |
A numeric vector containing the moving sum bandwidths;
all values in the vector |
lags |
A |
kernel.f |
String indicating which kernel function to use when calculating the NP-MOJO detector statistics. Possible choices are "quad.exp", "gauss", "euclidean", "laplace" and "sine". |
kern.par |
The tuning parameter that appears in the expression for the kernel function, which acts as a scaling parameter. |
data.driven.kern.par |
A |
threshold |
String indicating how the threshold is computed. Possible values are "bootstrap" and "manual". |
threshold.val |
The value of the threshold used to declare change points, only to be used if threshold = "manual". |
alpha |
A numeric value for the significance level, between 0 and 1. |
reps |
An integer value for the number of bootstrap replications performed, if threshold = "bootstrap". |
boot.dep |
A positive value for the strength of dependence in the multiplier bootstrap sequence, if threshold = "bootstrap". |
parallel |
A |
boot.method |
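A string indicating the method for creating bootstrap replications. It is not recommended to change this. Possible choices are "mean.subtract" and "no.mean.subtract". |
criterion |
String indicating how to determine whether each point is declared a change point. Possible values are "eta", "epsilon" and "eta.and.epsilon". |
eta |
A positive numeric value for the minimal mutual distance of changes, relative to bandwidth (if criterion = "eta" or "eta.and.epsilon"). |
epsilon |
A numeric value in (0,1] for the minimal size of exceeding environments, relative to moving sum bandwidth (if criterion = "epsilon" or "eta.and.epsilon"). |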
use.mean |
|
eta.merge |
A positive numeric value for the minimal mutual distance of changes, relative to bandwidth, used to merge change point estimators across different lags. |
merge.type |
String indicating the method used to merge change point estimators from different lags. Possible choices are "sequential" and "bottom-up". |
eta.bottom.up |
A positive numeric value for the minimal mutual distance of changes, relative to bandwidth, for use in bottom-up merging of change point estimators across multiple bandwidths. |
The multi-lag NP-MOJO algorithm for nonparametric change point detection is described in McGonigle, E. T. and Cho, H. (2023) Nonparametric data segmentation in multivariate time series via joint characteristic functions. arXiv preprint arXiv:2305.07581. The multiscale version uses bottom-up merging to combine the results of the multi-lag NP-MOJO algorithm performed over a given set of bandwidths.
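The bottom-up idea can be sketched in base R under a simplifying assumption: estimators are visited from the smallest bandwidth upwards, and an estimator found with bandwidth G is kept only if it lies at least eta.bottom.up * G away from every estimator already kept. The helper `bottom_up_merge` and the input values are hypothetical; this is not the package's implementation.

```r
# Hypothetical helper (not a package function): bottom-up merging of
# change point estimators obtained with several bandwidths.
bottom_up_merge <- function(cpts, G, eta.bottom.up = 0.8) {
  # Visit estimators by increasing bandwidth, then by location.
  ord <- order(G, cpts)
  cpts <- cpts[ord]
  G <- G[ord]
  keep <- logical(length(cpts))
  for (i in seq_along(cpts)) {
    accepted <- cpts[keep]
    # Keep estimator i if nothing already accepted lies within
    # eta.bottom.up * G[i] of it.
    keep[i] <- length(accepted) == 0 ||
      min(abs(cpts[i] - accepted)) >= eta.bottom.up * G[i]
  }
  data.frame(cp = cpts[keep], G = G[keep])
}

# Made-up estimators from bandwidths 50 and 80.
merged <- bottom_up_merge(cpts = c(101, 299, 103, 305),
                          G = c(50, 50, 80, 80))
merged
```

In this toy input the two estimators found with G = 80 fall within 0.8 * 80 = 64 observations of ones already accepted at G = 50, so only the finer-scale estimators survive.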
A list object that contains the following fields:
G |
Set of moving window bandwidths |
lags |
Lags used to detect changes |
kernel.f, data.driven.kern.par, use.mean |
Input parameters |
threshold, alpha, reps, boot.dep, boot.method, parallel |
Input parameters |
criterion, eta, epsilon |
Input parameters |
cpts |
A matrix with rows corresponding to final change point estimators, with estimated change point location and associated detection bandwidth, lag and p-value given in columns. |
McGonigle, E.T., Cho, H. (2023). Nonparametric data segmentation in multivariate time series via joint characteristic functions. arXiv preprint arXiv:2305.07581.
Fan, Y., de Micheaux, P.L., Penev, S. and Salopek, D. (2017). Multivariate nonparametric test of independence. Journal of Multivariate Analysis, 153, pp.189-210.
Messer M., Kirchner M., Schiemann J., Roeper J., Neininger R., Schneider G. (2014). A Multiple Filter Test for the Detection of Rate Changes in Renewal Processes with Varying Variance. The Annals of Applied Statistics, 8(4), 2027-2067.
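set.seed(1)
n <- 500
# AR(1) noise whose scale drops after time 300, plus a mean shift after time 100
noise <- c(rep(1, 300), rep(0.4, 200)) * stats::arima.sim(model = list(ar = 0.3), n = n)
signal <- c(rep(0, 100), rep(2, 400))
x <- signal + noise
# Two bandwidths and two lags; results are merged across both
x.c <- multiscale.np.mojo(x, G = c(50, 80), lags = c(0, 1))
x.c$cpts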
For a given lagged value of the time series, performs nonparametric change point detection of a possibly multivariate time series. If lag = 0, then only marginal changes are detected. If lag is nonzero, then changes in the pairwise distribution of (X_t, X_{t+lag}) are detected.
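To make the role of the lag concrete, the base R sketch below assumes that a nonzero lag pairs each observation with the one observed lag steps later. The toy series and variable names are illustrative only and do not reflect how the package stores its data.

```r
# Toy univariate series and an illustrative lag (both are assumptions).
x <- 1:6
lag <- 2
n <- length(x)

# Row t holds the pair (x[t], x[t + lag]); the last `lag` observations
# have no partner and are dropped.
pairs <- cbind(x_t = x[1:(n - lag)], x_t_plus_lag = x[(1 + lag):n])
pairs
```

With lag = 0 both columns would coincide, which is why only marginal changes can be picked up in that case.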
np.mojo(
  x,
  G,
  lag = 0,
  kernel.f = c("quad.exp", "gauss", "euclidean", "laplace", "sine")[1],
  kern.par = 1,
  data.driven.kern.par = TRUE,
  alpha = 0.1,
  threshold = c("bootstrap", "manual")[1],
  threshold.val = NULL,
  reps = 199,
  boot.dep = 1.5 * (nrow(as.matrix(x))^(1/3)),
  parallel = FALSE,
  boot.method = c("mean.subtract", "no.mean.subtract")[1],
  criterion = c("eta", "epsilon", "eta.and.epsilon")[3],
  eta = 0.4,
  epsilon = 0.02,
  use.mean = FALSE
)
x |
Input data (a |
G |
An integer value for the moving sum bandwidth;
|
lag |
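The lagged value of the time series used to detect changes. If lag = 0, then only marginal changes are detected; if lag is nonzero, then changes in the pairwise distribution of (X_t, X_{t+lag}) are detected. |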
kernel.f |
String indicating which kernel function to use when calculating the NP-MOJO detector statistics. Possible choices are "quad.exp", "gauss", "euclidean", "laplace" and "sine". |
kern.par |
The tuning parameter that appears in the expression for the kernel function, which acts as a scaling parameter, only to be used if data.driven.kern.par = FALSE. |
data.driven.kern.par |
A |
alpha |
A numeric value for the significance level, between 0 and 1. |
threshold |
String indicating how the threshold is computed. Possible values are "bootstrap" and "manual". |
threshold.val |
The value of the threshold used to declare change points, only to be used if threshold = "manual". |
reps |
An integer value for the number of bootstrap replications performed, if threshold = "bootstrap". |
boot.dep |
A positive value for the strength of dependence in the multiplier bootstrap sequence, if threshold = "bootstrap". |
parallel |
A |
boot.method |
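A string indicating the method for creating bootstrap replications. It is not recommended to change this. Possible choices are "mean.subtract" and "no.mean.subtract". |
criterion |
String indicating how to determine whether each point is declared a change point. Possible values are "eta", "epsilon" and "eta.and.epsilon". |
eta |
A positive numeric value for the minimal mutual distance of changes, relative to bandwidth (if criterion = "eta" or "eta.and.epsilon"). |
epsilon |
A numeric value in (0,1] for the minimal size of exceeding environments, relative to moving sum bandwidth (if criterion = "epsilon" or "eta.and.epsilon"). |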
use.mean |
|
The single-lag NP-MOJO algorithm for nonparametric change point detection is described in McGonigle, E. T. and Cho, H. (2023) Nonparametric data segmentation in multivariate time series via joint characteristic functions. arXiv preprint arXiv:2305.07581.
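The following base R sketch illustrates one common reading of a distance-based criterion such as eta: a point is kept if its detector statistic exceeds the threshold and is the largest value within floor(eta * G) points on either side. This reading, the helper `eta_local_maxima`, and the toy statistic are assumptions made for illustration; the package's own rule is the one described in the reference above.

```r
# Hypothetical helper (not a package function): keep points whose
# statistic exceeds the threshold and is maximal within an
# eta * G neighbourhood.
eta_local_maxima <- function(stat, threshold, G, eta = 0.4) {
  n <- length(stat)
  radius <- floor(eta * G)
  candidates <- which(stat > threshold)
  keep <- vapply(candidates, function(k) {
    window <- max(1, k - radius):min(n, k + radius)
    stat[k] >= max(stat[window])
  }, logical(1))
  candidates[keep]
}

# Toy detector statistic with two peaks.
stat <- c(0, 1, 3, 1, 0, 0, 2, 5, 2, 0)
est <- eta_local_maxima(stat, threshold = 1.5, G = 5, eta = 0.4)
est
```

Points 7 and 9 also exceed the threshold in this toy input, but each has the larger value at point 8 inside its neighbourhood, so only the two peaks are retained.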
A list object that contains the following fields:
x |
Input data |
G |
Moving window bandwidth |
lag |
Lag used to detect changes |
kernel.f, data.driven.kern.par, use.mean |
Input parameters |
kern.par |
The value of the kernel tuning parameter |
threshold, alpha, reps, boot.dep, boot.method, parallel |
Input parameters |
threshold.val |
Threshold value for declaring change points |
criterion, eta, epsilon |
Input parameters |
test.stat |
A vector containing the NP-MOJO detector statistics computed from the input data |
cpts |
A vector containing the estimated change point locations |
p.vals |
The corresponding p values of the change points, if the bootstrap method was used |
McGonigle, E.T., Cho, H. (2023). Nonparametric data segmentation in multivariate time series via joint characteristic functions. arXiv preprint arXiv:2305.07581.
Fan, Y., de Micheaux, P.L., Penev, S. and Salopek, D. (2017). Multivariate nonparametric test of independence. Journal of Multivariate Analysis, 153, pp.189-210.
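set.seed(1)
n <- 500
# AR(1) noise whose scale drops after time 300, plus a mean shift after time 100
noise <- c(rep(1, 300), rep(0.4, 200)) * stats::arima.sim(model = list(ar = 0.3), n = n)
signal <- c(rep(0, 100), rep(2, 400))
x <- signal + noise
# Single-lag detection at lag 0 (marginal changes only)
x.c <- np.mojo(x, G = 83, lag = 0)
x.c$cpts
x.c$p.vals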
For a given set of lagged values of the time series, performs nonparametric change point detection of a possibly multivariate time series.
np.mojo.multilag(
  x,
  G,
  lags = c(0, 1),
  kernel.f = c("quad.exp", "gauss", "euclidean", "laplace", "sine")[1],
  kern.par = 1,
  data.driven.kern.par = TRUE,
  threshold = c("bootstrap", "manual")[1],
  threshold.val = NULL,
  alpha = 0.1,
  reps = 199,
  boot.dep = 1.5 * (nrow(as.matrix(x))^(1/3)),
  parallel = FALSE,
  boot.method = c("mean.subtract", "no.mean.subtract")[1],
  criterion = c("eta", "epsilon", "eta.and.epsilon")[3],
  eta = 0.4,
  epsilon = 0.02,
  use.mean = FALSE,
  eta.merge = 1,
  merge.type = c("sequential", "bottom-up")[1]
)
x |
Input data (a |
G |
An integer value for the moving sum bandwidth;
|
lags |
A |
kernel.f |
String indicating which kernel function to use when calculating the NP-MOJO detector statistics. Possible choices are "quad.exp", "gauss", "euclidean", "laplace" and "sine". |
kern.par |
The tuning parameter that appears in the expression for the kernel function, which acts as a scaling parameter. |
data.driven.kern.par |
A |
threshold |
String indicating how the threshold is computed. Possible values are "bootstrap" and "manual". |
threshold.val |
The value of the threshold used to declare change points, only to be used if threshold = "manual". |
alpha |
A numeric value for the significance level, between 0 and 1. |
reps |
An integer value for the number of bootstrap replications performed, if threshold = "bootstrap". |
boot.dep |
A positive value for the strength of dependence in the multiplier bootstrap sequence, if threshold = "bootstrap". |
parallel |
A |
boot.method |
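A string indicating the method for creating bootstrap replications. It is not recommended to change this. Possible choices are "mean.subtract" and "no.mean.subtract". |
criterion |
String indicating how to determine whether each point is declared a change point. Possible values are "eta", "epsilon" and "eta.and.epsilon". |
eta |
A positive numeric value for the minimal mutual distance of changes, relative to bandwidth (if criterion = "eta" or "eta.and.epsilon"). |
epsilon |
A numeric value in (0,1] for the minimal size of exceeding environments, relative to moving sum bandwidth (if criterion = "epsilon" or "eta.and.epsilon"). |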
use.mean |
|
eta.merge |
A positive numeric value for the minimal mutual distance of changes, relative to bandwidth, used to merge change point estimators across different lags. |
merge.type |
String indicating the method used to merge change point estimators from different lags. Possible choices are "sequential" and "bottom-up". |
The multi-lag NP-MOJO algorithm for nonparametric change point detection is described in McGonigle, E. T. and Cho, H. (2023) Nonparametric data segmentation in multivariate time series via joint characteristic functions. arXiv preprint arXiv:2305.07581.
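The defaults in the usage signature are written with the idiom c(...)[k], which simply selects the k-th string, and boot.dep is computed from the number of observations. The base R sketch below evaluates these expressions exactly as they appear in the signature; the simulated input `x` is only a placeholder of length 500.

```r
# Each default picks one element of a character vector by position.
kernel.f    <- c("quad.exp", "gauss", "euclidean", "laplace", "sine")[1]
threshold   <- c("bootstrap", "manual")[1]
boot.method <- c("mean.subtract", "no.mean.subtract")[1]
criterion   <- c("eta", "epsilon", "eta.and.epsilon")[3]
merge.type  <- c("sequential", "bottom-up")[1]

# Default bootstrap dependence parameter, 1.5 * n^(1/3), for a
# placeholder series of length n = 500 (roughly 11.9).
x <- rnorm(500)
boot.dep <- 1.5 * (nrow(as.matrix(x))^(1/3))

c(kernel.f, threshold, boot.method, criterion, merge.type)
boot.dep
```

Because as.matrix() turns a vector into a one-column matrix, nrow(as.matrix(x)) counts time points both for univariate vectors and for matrices with one row per time point.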
A list object that contains the following fields:
G |
Moving window bandwidth |
lags |
Lags used to detect changes |
kernel.f, data.driven.kern.par, use.mean |
Input parameters |
threshold, alpha, reps, boot.dep, boot.method, parallel |
Input parameters |
criterion, eta, epsilon |
Input parameters |
cpts |
A matrix with rows corresponding to final change point estimators, with estimated change point location and associated lag and p-value given in columns. |
cpt.clusters |
A |
McGonigle, E.T., Cho, H. (2023). Nonparametric data segmentation in multivariate time series via joint characteristic functions. arXiv preprint arXiv:2305.07581.
Fan, Y., de Micheaux, P.L., Penev, S. and Salopek, D. (2017). Multivariate nonparametric test of independence. Journal of Multivariate Analysis, 153, pp.189-210.
Messer M., Kirchner M., Schiemann J., Roeper J., Neininger R., Schneider G. (2014). A Multiple Filter Test for the Detection of Rate Changes in Renewal Processes with Varying Variance. The Annals of Applied Statistics, 8(4), 2027-2067.
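set.seed(1)
n <- 500
# AR(1) noise whose scale drops after time 300, plus a mean shift after time 100
noise <- c(rep(1, 300), rep(0.4, 200)) * stats::arima.sim(model = list(ar = 0.3), n = n)
signal <- c(rep(0, 100), rep(2, 400))
x <- signal + noise
# Multi-lag detection at lags 0 and 1 with a single bandwidth
x.c <- np.mojo.multilag(x, G = 83, lags = c(0, 1))
x.c$cpts
x.c$cpt.clusters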