Module PdmContext.utils.distances
Functions
def PCA_pre(context1, seriesnames)
def build_2D_array(seriesnames, context1)
def calculate_3d_fft(x, fft_size)
def calculate_jaccard(a, context1, context2)
def common_values_calc(context1, context2)
def distance_3D_sbd_jaccard(context1: Context, context2: Context, a, verbose=False)
-
Calculates the similarity between two Context objects from two quantities:
1) A similarity based on the 3D SBD distance over all context data.
2) The Jaccard similarity of the edges in the CR, ignoring edge direction.
context1: A context object
context2: A context object
a: the weight of SBD similarity
verbose:
return: a similarity between 0 and 1, and a tuple with both the 3D SBD and the Jaccard similarity
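The two ingredients shared by the distance functions in this module, a direction-free Jaccard similarity over edges and a weight a, can be sketched as follows. This is a minimal illustration and not the library's code: the helper names undirected_jaccard and combine are hypothetical, edges are assumed to be pairs of series names, and the convex combination a * x + (1 - a) * y is an assumption about how the weight a is applied.

```python
def undirected_jaccard(edges1, edges2):
    """Jaccard similarity of two edge lists, ignoring edge direction."""
    # frozenset makes ("a", "b") and ("b", "a") compare equal.
    set1 = {frozenset(edge) for edge in edges1}
    set2 = {frozenset(edge) for edge in edges2}
    union = set1 | set2
    if not union:
        # Assumption: two empty edge sets are treated as identical.
        return 1.0
    return len(set1 & set2) / len(union)


def combine(series_similarity, jaccard_similarity, a):
    """Weighted mix of the two similarities (assumed convex combination)."""
    return a * series_similarity + (1 - a) * jaccard_similarity


edges_a = [("temp", "pressure"), ("pressure", "flow")]
edges_b = [("pressure", "temp"), ("flow", "vibration")]
jac = undirected_jaccard(edges_a, edges_b)  # one shared edge out of three
total = combine(0.8, jac, a=0.5)
```

With a = 1 only the series-based similarity counts, and with a = 0 only the edge overlap counts.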
def distance_PCA_jaccard(context1: Context, context2: Context, a, seriesnames, precalc=None, verbose=False)
-
Calculates the similarity between two Context objects from two quantities:
1) A similarity based on the singular values from PCA.
2) The Jaccard similarity of the edges in the CR, ignoring edge direction.
This method requires prior knowledge of the existence of all available sources in the context.
Parameters:
context1: A context object
context2: A context object
a: the weight of PCA similarity
seriesnames: A list of all names from available sources in the context.
precalc: If not None, the singular values are stored in the details of the Context each time a PCA fit is called, so that they are not recalculated the next time.
verbose:
return: a similarity between 0 and 1, and a tuple with both the PCA and the Jaccard similarity
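The role of seriesnames can be illustrated with a small NumPy sketch: a fixed list of names lets both contexts be turned into matrices with the same columns, so their singular values are comparable. Everything here is an assumption made for illustration: the helper names are hypothetical, missing series are filled with zeros, singular values are taken from an SVD of the column-centered matrix (the quantity a PCA fit exposes), each context has at least as many rows as there are series names, and the final similarity formula is one plausible choice, not necessarily the one used by distance_PCA_jaccard.

```python
import numpy as np


def build_matrix(data, seriesnames):
    """One column per name in seriesnames; absent series become zero columns."""
    n = len(next(iter(data.values())))  # assumes non-empty, equal-length series
    columns = [np.asarray(data.get(name, np.zeros(n)), dtype=float)
               for name in seriesnames]
    return np.column_stack(columns)


def singular_values(matrix):
    """Singular values of the column-centered data matrix."""
    centered = matrix - matrix.mean(axis=0)
    return np.linalg.svd(centered, compute_uv=False)


def singular_value_similarity(s1, s2):
    """Assumed similarity in [0, 1]: 1 - ||s1 - s2|| / (||s1|| + ||s2||)."""
    denom = np.linalg.norm(s1) + np.linalg.norm(s2)
    if denom == 0:
        return 1.0
    return 1.0 - float(np.linalg.norm(s1 - s2) / denom)


names = ["temp", "pressure", "vibration"]
d1 = {"temp": [1, 2, 3, 4, 5], "pressure": [2, 1, 2, 1, 2]}
d2 = {"temp": [1, 2, 3, 4, 5], "pressure": [5, 3, 4, 1, 2],
      "vibration": [0, 1, 0, 1, 0]}
s1 = singular_values(build_matrix(d1, names))
s2 = singular_values(build_matrix(d2, names))
pca_similarity = singular_value_similarity(s1, s2)
```

Because the singular values depend only on the context itself, they can be computed once and cached, which is the purpose the precalc parameter describes.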
def distance_cc(context1: Context, context2: Context, a, verbose=False)
-
Calculates the similarity between two Context objects from two quantities:
1) A similarity based on the SBD distance. We calculate the minimum (average) SBD between all common series in the CD of the contexts, over all possible shifts. Each shift is applied to all series at once, and each comparison uses the last n values, where n is the size of the shorter series. The result is also weighted by the ratio of common values.
2) The Jaccard similarity of the edges in the CR, ignoring edge direction.
context1: A context object
context2: A context object
a: the weight of SBD similarity
verbose:
return: a similarity between 0 and 1, and a tuple with both the pair-wise SBD and the Jaccard similarity
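The shift search described above can be sketched with NumPy. SBD (shape-based distance) is 1 minus the maximum normalized cross-correlation over all shifts. In this sketch the cross-correlation curves of the common series are averaged before taking the maximum, so one shift is applied to all series at once. The helper names are hypothetical, contexts are represented as plain dicts of series, and the ratio of common series is returned separately because the source does not state exactly how it is applied as a weight.

```python
import numpy as np


def ncc_curve(x, y):
    """Normalized cross-correlation of x and y at every possible shift."""
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    if denom == 0:
        return np.zeros(len(x) + len(y) - 1)
    return np.correlate(x, y, mode="full") / denom


def average_sbd(data1, data2):
    """Return (SBD over the common series, ratio of common series)."""
    common = sorted(set(data1) & set(data2))
    if not common:
        raise ValueError("the two contexts share no series")
    # Use the last n values, where n is the length of the shortest series.
    n = min(min(len(data1[k]), len(data2[k])) for k in common)
    if n == 0:
        raise ValueError("series must not be empty")
    curves = [
        ncc_curve(np.asarray(data1[k], dtype=float)[-n:],
                  np.asarray(data2[k], dtype=float)[-n:])
        for k in common
    ]
    # Averaging the curves first means the same shift applies to every series.
    mean_curve = np.mean(curves, axis=0)
    sbd = 1.0 - float(mean_curve.max())
    common_ratio = len(common) / len(set(data1) | set(data2))
    return sbd, common_ratio


c1 = {"temp": [1, 2, 3, 4], "pressure": [2, 1, 2, 1]}
c2 = {"temp": [1, 2, 3, 4], "pressure": [2, 1, 2, 1], "vibration": [0, 0, 1, 1]}
sbd, ratio = average_sbd(c1, c2)
shifted_sbd, _ = average_sbd({"temp": [0, 1, 0, 0]}, {"temp": [0, 0, 1, 0]})
```

Identical series give an SBD of 0, and so does a series compared with a shifted copy of itself, which is the property that makes SBD useful for comparing shapes rather than exact alignments.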
def distance_eu_z(context1: Context, context2: Context, a, verbose=False)
-
Calculates the similarity between two Context objects from two quantities:
1) A similarity based on the Euclidean distance after z-normalization. We calculate the Euclidean distance between the common values in the CD of the contexts and scale it to [0,1] as Euclidean(c1,c2)/(norm(c1)+norm(c2)). Each comparison uses the last n values, where n is the size of the shorter series.
2) The Jaccard similarity of the edges in the CR, ignoring edge direction.
context1: A context object
context2: A context object
a: the weight of Euclidean similarity
verbose:
return: a similarity between 0 and 1, and a tuple with both the z-norm Euclidean and the Jaccard similarity
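The scaled Euclidean distance Euclidean(c1,c2)/(norm(c1)+norm(c2)) stays in [0,1] because of the triangle inequality. A minimal sketch for a single pair of series follows. The helper names are hypothetical, and trimming to the last n values before z-normalizing is an assumption about the order of operations.

```python
import numpy as np


def z_normalize(x):
    """Shift to zero mean and scale to unit standard deviation."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    if std == 0:
        return np.zeros_like(x)  # a constant series carries no shape
    return (x - x.mean()) / std


def scaled_euclidean(x, y):
    """Euclidean(c1, c2) / (norm(c1) + norm(c2)) on the last n values."""
    n = min(len(x), len(y))
    if n == 0:
        raise ValueError("series must not be empty")
    c1 = z_normalize(x[-n:])
    c2 = z_normalize(y[-n:])
    denom = np.linalg.norm(c1) + np.linalg.norm(c2)
    if denom == 0:
        return 0.0
    return float(np.linalg.norm(c1 - c2) / denom)


same = scaled_euclidean([1, 2, 3], [10, 20, 30])      # same shape after z-norm
opposite = scaled_euclidean([1, 2, 3], [3, 2, 1])     # mirrored shape
trimmed = scaled_euclidean([9, 9, 1, 2, 3], [1, 2, 3])  # only last 3 values used
```

A distance of 0 means the two shapes coincide after normalization, and 1 means they are exact opposites; a similarity can then be taken as 1 minus this value, which is again an assumption rather than something the source states.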
def get_precalculated_fft(seriesnames, fftsize, context1, common_values)
def ignore_order(context1: Context)
def ignore_order_list(edgeslist1)
def jaccard_CR(context1, context2)
def jaccard_distance_CR(context1, context2)
def nearest(TargetSet: list[Context], query: Context, threshold: float, distance)
-
Searches the TargetSet for a context object similar to the query, where similar means having a similarity of at least the threshold.
Parameters:
TargetSet: A list of Context objects to search for similar ones
query: The query Context object
threshold: The similarity threshold (a real value in [0,1])
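A threshold search of this kind can be sketched as below. This is not the library's implementation: the name nearest_sketch is hypothetical, the distance argument is assumed to be a callable taking two items and returning a (similarity, details) pair as the distance functions in this module are documented to do, and returning the best match (or None) is an assumption, since the return value of nearest is not documented here. Plain numbers stand in for Context objects to keep the example self-contained.

```python
def nearest_sketch(target_set, query, threshold, distance):
    """Return (best_item, similarity) with similarity >= threshold, else None."""
    best_item = None
    best_similarity = threshold
    for candidate in target_set:
        similarity, _details = distance(candidate, query)
        if similarity >= best_similarity:
            best_item, best_similarity = candidate, similarity
    if best_item is None:
        return None
    return best_item, best_similarity


def toy_distance(x, y):
    """Toy stand-in for a distance function: closer numbers are more similar."""
    return 1 - abs(x - y), (None, None)


hit = nearest_sketch([0.1, 0.5, 0.9], 0.55, threshold=0.8, distance=toy_distance)
miss = nearest_sketch([0.1, 0.5, 0.9], 0.55, threshold=0.99, distance=toy_distance)
```

Since the distance functions above also take a weight a, a two-argument callable could be obtained with functools.partial or a small lambda that fixes a.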
def np_pearson_cor(x, y)
def sbd_3d(common_values, uncommon_values, context1, context2, verbose=False)