autocpd package

Submodules

autocpd.neuralnetwork module

autocpd.neuralnetwork.compile_and_fit(model, x_train, y_train, batch_size, lr, name, log_dir, epochdots, optimizer=None, validation_split=0.2, max_epochs=10000)

To compile and fit the model

Parameters

modelModels object

the simple neural network

x_traintf.Tensor

the tensor of training data

y_traintf.Tensor

the tensor of training labels

batch_sizeint

the batch size

lrfloat

the learning rate

namestr

the model name

log_dirstr

the path of log files

epochdotsobject

the EpochDots object from tensorflow_docs

optimizeroptimizer object or str, optional

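the optimizer, by default None

validation_splitfloat, optional

the fraction of the training data held out for validation, by default 0.2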

max_epochsint, optional

the maximum number of epochs, by default 10000

Returns

model.fit object

a fitted model object

autocpd.neuralnetwork.deep_nn(n, n_trans, kernel_size, n_filter, dropout_rate, n_classes, m, l, model_name='deep_nn')

This function is used to construct the deep neural network with 21 residual blocks.

Parameters

nint

the length of time series

n_transint

the number of transformations

kernel_sizeint

the kernel size

n_filterint

the filter size

dropout_ratefloat

the dropout rate

n_classesint

the number of classes

marray

the width vector

lint

the number of dense layers

model_namestr, optional

the model name, by default “deep_nn”

Returns

model

the model of deep neural network

autocpd.neuralnetwork.general_deep_nn(n, n_trans, kernel_size, n_filter, dropout_rate, n_classes, n_resblock, m, l, model_name='deep_nn')

This function is used to construct the deep neural network with a user-specified number of residual blocks.

Parameters

nint

the length of time series

n_transint

the number of transformations

kernel_sizeint

the kernel size

n_filterint

the filter size

dropout_ratefloat

the dropout rate

n_classesint

the number of classes

n_resblockint

the number of residual blocks

marray

the width vector

lint

the number of dense layers

model_namestr, optional

the model name, by default “deep_nn”

Returns

model

the model of deep neural network

autocpd.neuralnetwork.general_simple_nn(n, l, m, num_classes, model_name='simple_nn')

To construct a simple neural network.

Parameters

nscalar

the input size

lscalar

the number of hidden layers

mscalar or 1D array

the width vector of the hidden layers; if it is a scalar, all hidden layers of the simple neural network have the same number of nodes.

num_classesscalar

the number of nodes in the output layer, i.e., the number of classes

model_namestr, optional

the model name, by default “simple_nn”

Returns

model

the simple neural network
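
As a rough illustration of how a scalar or vector m could be expanded into per-layer widths, the sketch below lists the (input, output) size of each dense layer. The helpers `hidden_widths` and `layer_shapes` are hypothetical names that are not part of autocpd, and the package's own handling of m may differ.

```python
def hidden_widths(l, m):
    """Expand m into a list of l hidden-layer widths (hypothetical helper)."""
    if isinstance(m, int):
        return [m] * l                      # scalar: every hidden layer has m nodes
    widths = [int(w) for w in m]
    if len(widths) != l:
        raise ValueError("the width vector must have one entry per hidden layer")
    return widths


def layer_shapes(n, l, m, num_classes):
    """Return (input_size, output_size) for each dense layer, output layer last."""
    sizes = [n] + hidden_widths(l, m) + [num_classes]
    return list(zip(sizes[:-1], sizes[1:]))


# Three hidden layers of 50 nodes each on an input of length 100.
shapes = layer_shapes(n=100, l=3, m=50, num_classes=2)
```

Passing a list such as m=[64, 32, 16] with l=3 would instead give each hidden layer its own width.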

autocpd.neuralnetwork.get_callbacks(name, log_dir, epochdots)

Get callbacks. The callbacks report the results of each epoch during training and stop the training early if certain conditions are met. They also save the training results to TensorBoard and CSV files.

Parameters

namestr

the model name

log_dirstr

the path of log files

epochdotsobject

the EpochDots object from tensorflow_docs

Returns

list

the list of callbacks

autocpd.neuralnetwork.get_optimizer(learning_rate)

To get the optimizer given the learning rate.

Parameters

learning_ratefloat

the learning rate for inverse time decay schedule.

Returns

optimizer

the Adam optimizer
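
To make the inverse time decay schedule concrete, the sketch below evaluates the usual continuous formula, initial rate divided by (1 + decay_rate * step / decay_steps). The values of `decay_steps` and `decay_rate` are assumptions chosen for illustration; the values used inside get_optimizer are not documented here.

```python
def inverse_time_decay(initial_lr, step, decay_steps, decay_rate):
    """Learning rate after `step` optimizer steps under inverse time decay."""
    return initial_lr / (1.0 + decay_rate * step / decay_steps)


# Assumed settings: the rate halves after 1000 steps and quarters after 3000.
schedule = [
    inverse_time_decay(1e-3, step, decay_steps=1000, decay_rate=1.0)
    for step in (0, 1000, 3000)
]
```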

autocpd.neuralnetwork.resblock(x, kernel_size, filters, strides=1)

This function constructs a resblock.

Parameters

xtensor

the input data

kernel_sizeint

the kernel size

filtersint

the filter size

stridesint, optional

the stride, by default 1

Returns

layer

the hidden layer

autocpd.pre_trained_model module

autocpd.pre_trained_model.load_pretrained_model(path)

Load the pretrained model

Parameters

pathstr

the path of pre-trained model

Returns

tf.Model

the pre-trained model.

autocpd.utils module

autocpd.utils.ComputeCUSUM(x)

Compute the CUSUM statistics with O(n) time complexity

Parameters

xvector

the time series

Returns

vector

a: the CUSUM statistics vector.
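
The O(n) cost comes from reusing a running sum rather than recomputing both segment means at every split. The sketch below uses the standard CUSUM form, sqrt(t(n-t)/n) times the difference of the left and right means; `cusum_statistics` is a hypothetical name, and the scaling or sign used by ComputeCUSUM may differ.

```python
import math


def cusum_statistics(x):
    """Return the n-1 CUSUM statistics of x in O(n) time (illustrative sketch)."""
    n = len(x)
    total = sum(x)
    stats = []
    left_sum = 0.0
    for t in range(1, n):                   # split after the t-th observation
        left_sum += x[t - 1]                # running sum keeps each step O(1)
        left_mean = left_sum / t
        right_mean = (total - left_sum) / (n - t)
        scale = math.sqrt(t * (n - t) / n)
        stats.append(scale * (left_mean - right_mean))
    return stats


series = [0.0, 0.0, 0.0, 0.0, 2.0, 2.0, 2.0, 2.0]
stats = cusum_statistics(series)
# The split with the largest absolute statistic estimates the change point.
best_split = max(range(len(stats)), key=lambda i: abs(stats[i])) + 1
```

Taking the maximum absolute value of `stats` gives the kind of scalar summary that MaxCUSUM is described as returning.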

autocpd.utils.ComputeMeanVarNorm(x, minseglen=2)

Compute the likelihood for change in variance. Translated from the R function single.var.norm.calc() in the changepoint package.

Parameters

xnumpy array

the time series

minseglenint

the minimum length of segment

Returns

scalar

the likelihood ratio
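
A minimal sketch of a Gaussian change-in-variance likelihood ratio is given below, assuming a common mean estimated once from the whole series. It is not a line-by-line port of the R code; `variance_change_lr` is a hypothetical name, and it returns the best split as well as the statistic.

```python
import math


def variance_change_lr(x, minseglen=2):
    """Return (tau, statistic) maximising a change-in-variance likelihood ratio."""
    n = len(x)
    mean = sum(x) / n                       # assumption: common mean, estimated once
    prefix = [0.0]                          # prefix sums of squared deviations
    for value in x:
        prefix.append(prefix[-1] + (value - mean) ** 2)
    null = n * math.log(prefix[n] / n)      # single-variance fit
    best_tau, best_stat = None, -math.inf
    for tau in range(minseglen, n - minseglen + 1):
        var_left = prefix[tau] / tau
        var_right = (prefix[n] - prefix[tau]) / (n - tau)
        alt = tau * math.log(var_left) + (n - tau) * math.log(var_right)
        if null - alt > best_stat:
            best_tau, best_stat = tau, null - alt
    return best_tau, best_stat


# The variance jumps from 1 to 9 after the sixth observation.
series = [1.0, -1.0] * 3 + [3.0, -3.0] * 3
tau, stat = variance_change_lr(series)
```

The sketch does not guard against a segment with zero variance, which would make the logarithm undefined.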

autocpd.utils.ComputeMosum(x, G)

Compute the MOSUM statistic, translated from the mosum.stat function in the mosum R package.

Parameters

xnumpy array

The time series

Gscalar

the width of moving window

Returns

int

the location of maximum mosum statistics
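
The idea of a moving-sum contrast can be sketched as follows: at each position k, compare the sum of the G observations after k with the sum of the G observations before k. This is a simplified illustration with a hypothetical name, `mosum_location`; the variance scaling and boundary handling of the R implementation are not reproduced.

```python
import math


def mosum_location(x, G):
    """Return (location, statistic) maximising a two-window MOSUM contrast."""
    n = len(x)
    prefix = [0.0]
    for value in x:
        prefix.append(prefix[-1] + value)
    best_k, best_stat = None, -1.0
    for k in range(G, n - G + 1):
        left = prefix[k] - prefix[k - G]    # sum of x[k-G:k]
        right = prefix[k + G] - prefix[k]   # sum of x[k:k+G]
        stat = abs(right - left) / math.sqrt(2 * G)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


# The mean jumps from 0 to 3 at index 10.
series = [0.0] * 10 + [3.0] * 10
loc, stat = mosum_location(series, G=4)
```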

autocpd.utils.DataGenAlternative(N_sub, B, mu_L, n, B_bound, ARcoef=0.0, tau_bound=2, ar_model='Gaussian', scale=0.1, sigma=1.0)

This function generates the simulation data from the alternative model of change in mean.

Parameters

N_subint

The sample size of simulation data.

Bfloat

The signal-to-noise ratio of parameter space.

mu_Lfloat

The signal to the left of the change point.

nint

The length of time series.

B_boundlist

The upper and lower bound scalars of signal-to-noise.

ARcoeffloat, optional

The autoregressive parameter of AR(1) model, by default 0.0

tau_boundint, optional

The lower bound of change point, by default 2

ar_modelstr, optional

The noise model, by default 'Gaussian'. ar_model="AR0" means AR(1) noise with autoregressive parameter 'ARcoef'; ar_model="ARH" means Cauchy noise with scale parameter 'scale'; ar_model="ARrho" means AR(1) noise with random autoregressive parameter 'scale'.

scalefloat, optional

The scale parameter of Cauchy distribution, by default 0.1

sigmafloat, optional

The standard deviation of the normal distribution, by default 1.0

Returns

dict

data: size (N_sub, n); tau_alt: size (N_sub,), the change points; mu_R: size (N_sub,), the signal to the right of the change point.

autocpd.utils.DataGenScenarios(scenario, N, B, mu_L, n, B_bound, rho, tau_bound)

This function generates the data based on Scenarios 1, 2 and 3 in “Automatic Change-point Detection in Time Series via Deep Learning” (Jie et al., 2023).

Parameters

scenariostring

the scenario label: ‘A0’ is Scenario 1 with ‘rho=0’, ‘A07’ is Scenario 1 with ‘rho=0.7’, ‘C’ is Scenario 2 and ‘D’ is Scenario 3 with heavy-tailed noise.

Nint

the sample size

Bfloat

The signal-to-noise ratio of parameter space.

mu_Lfloat

The signal to the left of the change point.

nint

The length of time series.

B_boundlist

The upper and lower bound scalars of signal-to-noise.

rhoscalar

the autocorrelation of AR(1) model

tau_boundint

The lower bound of change point

Returns

dict

data_all: the time series; y_all: the label array.

autocpd.utils.DataSetGen(N_sub, n, mean_arg, var_arg, slope_arg, n_trim, seed=2022)

This function generates the simulation dataset for change in mean, change in variance and change in non-zero slope. For more details, see Table S1 in the supplement of “Automatic Change-point Detection in Time Series via Deep Learning” (Jie et al., 2023).

Parameters

N_subint

the sample size of each class

nint

the length of time series

mean_argarray

the hyperparameters for generating data of change in mean and null

var_argarray

the hyperparameters for generating data of change in variance and null

slope_argarray

the hyperparameters for generating data of change in slope and null

n_trimint

the trim size

seedint, optional

the random seed, by default 2022

Returns

dictionary

the simulation data and corresponding changes

autocpd.utils.ExtractSubject(subject_path, length, size)

To extract the null labels without change-points from one subject

Parameters

subject_pathstring

the path of subject data

lengthint

the length of extracted time series

sizeint

the sample size

Returns

dict

ts: time series; label: the labels.

autocpd.utils.GenDataMean(N, n, cp, mu, sigma)

The function generates the data for change in mean with Gaussian noise. When “cp” is None, it generates the data without change point.

Parameters

Nint

the sample size

nint

the length of time series

cpint

the change point; only one change point is accepted by this function.

mufloat

the piecewise mean

sigmafloat

the standard deviation of Gaussian distribution

Returns

numpy array

2D array with size (N, n)
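
A sketch of this kind of generator is shown below, assuming that mu holds the means before and after the change point and that the noise is independent Gaussian. `gen_data_mean` and its `seed` argument are hypothetical and only mirror the documented signature; the package itself returns a numpy array rather than nested lists.

```python
import random


def gen_data_mean(N, n, cp, mu, sigma, seed=0):
    """Simulate N series of length n whose mean switches from mu[0] to mu[1] at cp."""
    rng = random.Random(seed)
    data = []
    for _ in range(N):
        if cp is None:
            signal = [mu[0]] * n            # no change point: constant mean
        else:
            signal = [mu[0]] * cp + [mu[1]] * (n - cp)
        data.append([level + rng.gauss(0.0, sigma) for level in signal])
    return data


# Five series of length 100 with a mean shift of 2 at position 40.
sample = gen_data_mean(N=5, n=100, cp=40, mu=(0.0, 2.0), sigma=1.0)
```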

autocpd.utils.GenDataMeanAR(N, n, cp, mu, sigma, coef)

The function generates the data for change in mean with AR(1) noise. When “cp” is None, it generates the data without change point.

Parameters

Nint

the sample size

nint

the length of time series

cpint

the change point; only one change point is accepted by this function.

mufloat

the piecewise mean

sigmafloat

the standard deviation of Gaussian innovations in AR(1) noise

coeffloat scalar

the coefficient of the AR(1) model

Returns

numpy array

2D array with size (N, n)

autocpd.utils.GenDataMeanARH(N, n, cp, mu, coef, scale)

The function generates the data for change in mean + Cauchy noise with location parameter 0 and scale parameter ‘scale’. When “cp” is None, it generates the data without change point.

Parameters

Nint

the sample size

nint

the length of time series

cpint

the change point; only one change point is accepted by this function.

mufloat

the piecewise mean

coeffloat array

the coefficients of AR(1) model

scalefloat

the scale parameter of Cauchy distribution

Returns

numpy array

2D array with size (N, n)

autocpd.utils.GenDataMeanARrho(N, n, cp, mu, sigma)

The function generates the data for change in mean with AR(1) noise. The autoregressive coefficient is generated from standard uniform distribution. When “cp” is None, it generates the data without change point.

Parameters

Nint

the sample size

nint

the length of time series

cpint

the change point; only one change point is accepted by this function.

mufloat

the piecewise mean

sigmafloat

the standard deviation of the normal distribution

Returns

numpy array

2D array with size (N, n)

autocpd.utils.GenDataSlope(N, n, cp, slopes, sigma, start)

The function generates the data for change in slope with Gaussian noise. When “cp” is None, it generates the data without change point in slope.

Parameters

Nint

the sample size

nint

the length of time series

cpint

the change point; only one change point is accepted by this function.

slopesfloat

the slopes before and after the change point

sigmafloat

the standard deviation of Gaussian distribution

startfloat

the y-intercept of linear model

Returns

numpy array

2D array with size (N, n)

autocpd.utils.GenDataVariance(N, n, cp, mu, sigma)

The function generates the data for change in variance with piecewise constant signal. When “cp” is None, it generates the data without change point in variance.

Parameters

Nint

the sample size

nint

the length of time series

cpint

the change point; only one change point is accepted by this function.

mufloat

the piecewise mean

sigmafloat

the standard deviation of Gaussian distribution

Returns

numpy array

2D array with size (N, n)

autocpd.utils.GenerateAR(n, coef_left, coef_right, tau, sigma)

This function generates the signal of AR(1) model

Parameters

ninteger

The length of time series

coef_leftfloat

The AR coefficient before the change-point

coef_rightfloat

The AR coefficient after the change-point

tauinteger

The location of change-point

sigmafloat

The standard deviation of noise

Returns

array

The time series with length n.
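
The recursion behind an AR(1) series whose coefficient changes at tau can be sketched as below. The zero starting value, the convention that the new coefficient applies from index tau onwards, and the `seed` argument are assumptions of this illustration; `generate_ar1` is a hypothetical name.

```python
import random


def generate_ar1(n, coef_left, coef_right, tau, sigma, seed=0):
    """Simulate x[t] = coef * x[t-1] + noise, with coef switching at index tau."""
    rng = random.Random(seed)
    series = []
    previous = 0.0                          # assumption: the recursion starts at zero
    for t in range(n):
        coef = coef_left if t < tau else coef_right
        previous = coef * previous + rng.gauss(0.0, sigma)
        series.append(previous)
    return series


# The autocorrelation strengthens from 0.2 to 0.8 halfway through the series.
x = generate_ar1(n=200, coef_left=0.2, coef_right=0.8, tau=100, sigma=1.0)
```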

autocpd.utils.GenerateARAll(N, n, coef_left, coef_right, sigma, tau_bound)

This function generates N AR(1) signals

Parameters

Ninteger

The number of observations

ninteger

The length of time series

coef_leftfloat

The AR coefficient before the change-point

coef_rightfloat

The AR coefficient after the change-point

sigmafloat

The standard deviation of noise

tau_boundinteger

The bound of change-point

Returns

2D array and change-points

dataset with size (2*N, n), N change-points

autocpd.utils.MaxCUSUM(x)

To return the maximum of CUSUM

Parameters

xvector

the time series

Returns

scalar

the maximum of CUSUM

autocpd.utils.Standardize(data)

Data standardization

Parameters

datanumpy array

the data set with size (N, …, n)

Returns

data

standardized data
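
One plausible reading of standardization for data of size (N, …, n) is to centre and scale each series along its last axis. The sketch below does this for a 2-D nested list using the population standard deviation; both choices are assumptions rather than the documented behaviour of Standardize, and `standardize_rows` is a hypothetical name.

```python
import statistics


def standardize_rows(data):
    """Centre each row to mean 0 and scale it to unit (population) standard deviation."""
    out = []
    for row in data:
        mean = statistics.fmean(row)
        sd = statistics.pstdev(row)         # a constant row would give sd == 0
        out.append([(value - mean) / sd for value in row])
    return out


# Rows that differ only in scale become identical after standardization.
rows = standardize_rows([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])
```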

autocpd.utils.Transform2D(data_y, rescale=False, cumsum=False)

Apply 4 transformations (original, squared, log squared, tanh) to the same dataset

Parameters

data_ynumpy array

the 2-D array

rescalebool, optional

default False

cumsumbool, optional

replace tanh transformation with cusum transformation, default False

Returns

numpy array

3-D array with size (N, 4, n)
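
The shape change from (N, n) to (N, 4, n) can be illustrated by stacking the four transformed copies of each series. The sketch below ignores the `rescale` and `cumsum` options, and the small constant `eps` inside the logarithm is an assumption added to avoid log(0); `transform_2d` is a hypothetical name.

```python
import math


def transform_2d(data_y, eps=1e-12):
    """Stack original, squared, log-squared and tanh versions of each row."""
    out = []
    for row in data_y:
        original = list(row)
        squared = [v * v for v in row]
        log_squared = [math.log(v * v + eps) for v in row]   # eps is an assumption
        tanh = [math.tanh(v) for v in row]
        out.append([original, squared, log_squared, tanh])
    return out


# Two series of length 3 become a nested list of shape (2, 4, 3).
out = transform_2d([[1.0, 2.0, 3.0], [0.5, -0.5, 0.0]])
```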

autocpd.utils.Transform2D2TR(data_y, rescale=False, times=2)

Apply 2 transformations (original, squared) to the same dataset, each transformation is repeated user-specified times.

Parameters

data_ynumpy array

the 2-D array

rescalebool, optional

default False

timesinteger

the number of repetitions

Returns

numpy array

3-D array with size (N, 2*times, n)

autocpd.utils.extract(n1, n2, length, size, ntrim)

This function randomly extracts samples (consecutive segments) of length ‘length’ from a time series formed by concatenating two time series of lengths ‘n1’ and ‘n2’. The argument ‘ntrim’ controls the minimum distance between the change-point and the start or end of each segment. It returns a dictionary containing two arrays: ‘cp’, an array of change points, and ‘sample’, a 2D array in which each row holds the indices of one consecutive segment.

Parameters

n1int

the length of time series before change-point

n2int

the length of time series after change-point

lengthint

the length of time series segment that we want to extract

sizeint

the sample size

ntrimint

the number of observations to be trimmed before and after the change-point

Returns

dict

‘cp’ is the set of change-points. ‘sample’ is a matrix of indices
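
The constraint imposed by ntrim can be made explicit with a short sketch: a segment may start anywhere that keeps the change point at least ntrim observations away from both ends. The name `extract_segments`, the `seed` argument, and the choice to report each change point relative to the start of its segment are assumptions of this illustration.

```python
import random


def extract_segments(n1, n2, length, size, ntrim, seed=0):
    """Sample index windows of the given length that straddle the change point at n1."""
    rng = random.Random(seed)
    # A valid start s satisfies ntrim <= n1 - s <= length - ntrim
    # and keeps the window inside the concatenated series.
    lowest = max(0, n1 - length + ntrim)
    highest = min(n1 - ntrim, n1 + n2 - length)
    if lowest > highest:
        raise ValueError("no valid segment exists for these arguments")
    cps, samples = [], []
    for _ in range(size):
        start = rng.randint(lowest, highest)
        samples.append(list(range(start, start + length)))
        cps.append(n1 - start)              # change point relative to the segment
    return {"cp": cps, "sample": samples}


result = extract_segments(n1=100, n2=100, length=50, size=20, ntrim=10)
```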

autocpd.utils.get_asyvar_window(x, momentp=1)

This function computes the asymptotic variance of a time series with long-run dependence using the “window” method. It is translated from the R function “asymvar.window” and has been tested with overlapping=F and obs="ranks".

Parameters

x1D array

The time series

momentpint, optional

which centred mean should be used, see Peligrad and Shao (1995) for details, by default 1

Returns

scalar

The asymptotic variance of time series.

autocpd.utils.get_cusum_location(x)

This function returns the estimate of the change-point location based on CUSUM.

Parameters

xnumpy array

The time series

Returns

int

change-point location

autocpd.utils.get_key(y_pred, label_dict)

Get the labels corresponding to the predicted value

Parameters

y_predint

the value of prediction

label_dictdict

the label dictionary

Returns

list

the label list
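
Mapping a predicted value back to its label is a reverse dictionary lookup. The sketch below assumes that label_dict maps label names to integer codes, which this page does not state; `keys_for_value` and the example labels are hypothetical.

```python
def keys_for_value(y_pred, label_dict):
    """Return every key of label_dict whose value equals y_pred."""
    return [key for key, value in label_dict.items() if value == y_pred]


# Hypothetical label dictionary, for illustration only.
labels = {"walk": 0, "jog": 1, "stay": 2}
predicted = keys_for_value(1, labels)
```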

autocpd.utils.get_label(model, x_test, n)

This function gets the predicted label for the testing time series x_test

Parameters

modeltensorflow model

The trained tensorflow model

x_testvector

The vector of time series

nint

The width of moving window

Returns

arrays

two arrays, one is predicted label, the other is probabilities.

autocpd.utils.get_label_hasc(model, x_test, label_dict)

This function gets the predicted label for the HASC data

Parameters

modeltensorflow model

The trained tensorflow model

x_test2D array

The array of test dataset

label_dictdict

The label dictionary

Returns

arrays

two arrays, one is predicted label, the other is probabilities.

autocpd.utils.get_loc_3(model, x_test, n, width)

This function obtains the change-point locations for the NN and double MOSUM methods, based on the predicted labels and probabilities.

Parameters

modelmodel

The trained model

x_testvector

The vector of time series

nint

The length of x_test

widthint

The width of second moving window.

Returns

array

3 locations.

autocpd.utils.get_mosum_loc_double(x, n, width, use_prob)

This function returns the estimate of the change-point location based on MOSUM with a second moving average.

Parameters

xarray

either the predicted labels or probabilities

nint

The width of moving window

Returns

int

change-point location

autocpd.utils.get_mosum_loc_nn(pred, n)

This function returns the estimate of the change-point location based on MOSUM using NN.

Parameters

predvector

The vector of predicted labels

nint

The width of moving window

Returns

int

change-point location

autocpd.utils.get_wilcoxon_test(x)

Compute the Wilcoxon statistics

Parameters

xarray

the time series

Returns

scalar

the maximum Wilcoxon statistics

autocpd.utils.labelSubject(subject_path, length, size, num_trim=100)

Obtain the transition labels, change-points and time series from one subject.

Parameters

subject_pathstring

the path of subject data

lengthint

the length of extracted time series

sizeint

the sample size

num_trimint, optional

the number of observations to be trimmed before and after the change-point, by default 100

Returns

dictionary

cp: the change-points; ts: time series; label: the transition labels.

autocpd.utils.labelTransition(data, label, ind, length, size, num_trim=100)

Get the transition labels, change-points and time series from one subject.

Parameters

dataDataFrame

the time series.

labelDataFrame

the states of the subject

indscalar

the index of state

lengthint

the length of extracted time series

sizeint

the sample size

num_trimint, optional

the number of observations to be trimmed before and after the change-point, by default 100

Returns

dictionary

cp: the change-points; ts: time series; label: the transition labels.

autocpd.utils.seqPlot(sequences_list, cp_list, label_list, y_pos=0.93)

This function plots the sequence given change-points and label list.

Parameters

sequences_listDataFrame

the time series

cp_listlist

the list of change-point

label_listlist

the list of labels

y_posfloat, optional

the position of y, used in matplotlib, by default 0.93

autocpd.utils.tsExtract(data_trim, new_label, length, size, len0)

To extract the labels without change-points

Parameters

data_trimDataFrame

the dataset of one specific state

new_labelDataFrame

the label, not transition label.

lengthint

the length of extracted time series

sizeint

the sample size

len0int

the length of time series for one specific state

Returns

dict

ts: time series; label: the labels.

autocpd.utils.wilcoxon(x)

This function implements the Wilcoxon cumulative sum statistic (Dehling et al., 2013, Eq. (20)) for nonparametric change point detection. The code is translated from the C function “wilcoxsukz” in the R package “robts”. The accuracy of this function has been tested.

Parameters

xarray

time series

Returns

1D array

the test statistic for each potential change point.
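
For intuition, a Wilcoxon-type change-point statistic at split k counts how often an observation before the split is smaller than one after it, centred by one half per pair. The direct sketch below is unnormalized and far slower than the translated C routine; it assumes this pairwise form and is not the package's implementation. `wilcoxon_cusum` is a hypothetical name.

```python
def wilcoxon_cusum(x):
    """Return the unnormalized Wilcoxon two-sample statistic for each split k = 1..n-1."""
    n = len(x)
    stats = []
    for k in range(1, n):                   # x[:k] against x[k:]
        total = 0.0
        for i in range(k):
            for j in range(k, n):
                total += (1.0 if x[i] < x[j] else 0.0) - 0.5
        stats.append(total)
    return stats


# The level shifts upwards after the fifth observation.
series = [0.1, 0.3, 0.2, 0.0, 0.4, 1.2, 1.0, 1.4, 1.1, 1.3]
stats = wilcoxon_cusum(series)
best_split = max(range(len(stats)), key=lambda i: abs(stats[i])) + 1
```

The largest absolute value in `stats` is the kind of scalar summary that get_wilcoxon_test is described as returning, up to normalization.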

Module contents