pygmi.clust.crisp_clust#
A set of crisp clustering routines based on standard statistical methods. Each sample is assigned to exactly one cluster, as opposed to fuzzy methods.
Classes#
CrispClust | Crisp cluster GUI class.
Functions#
gcentroids(data, index, no_clust, mindist) | G Centroids.
gdist(data, center, index, no_clust, cltype, cov_constr) | G Dist routine.
Module Contents#
- class pygmi.clust.crisp_clust.CrispClust(parent=None)#
Bases:
pygmi.misc.BasicModule
Crisp cluster GUI class.
- Parameters:
parent (parent, optional) – Reference to the parent routine. The default is None.
- setupui()#
Set up UI.
- Return type:
None.
- combo()#
Set up combo box to choose algorithm.
- Return type:
None.
- settings(nodialog=False)#
Entry point into item.
- Parameters:
nodialog (bool, optional) – Run settings without a dialog. The default is False.
- Returns:
True if successful, False otherwise.
- Return type:
bool
- saveproj()#
Save project data from class.
- Return type:
None.
- update_vars()#
Update the variables.
- Return type:
None.
- acceptall()#
Process the data.
- Return type:
None.
- crisp_means(data, no_clust, cent, centfix, maxit, term_thresh, cltype, cov_constr)#
Performs crisp clustering of complete multivariate datasets.
- Parameters:
data (numpy array) – N x P matrix containing the data to be clustered, where N is the number of samples and P is the number of attributes available for each sample.
no_clust (int) – Number of clusters to be used.
cent (numpy array) – Cluster centre positions. Either empty ([]), in which case randomly guessed centre positions are used for initialisation, or a no_clust x P matrix.
centfix (numpy array) – Constrains the positions of the cluster centres. If centfix is empty, the centres can vary freely during cluster analysis; otherwise centfix has the same size as cent and gives the absolute deviation from the initial centre positions that must not be exceeded during clustering. Note: centfix applies only if centre values are provided by the user.
maxit (int) – Maximum number of iterations allowed.
term_thresh (float) – Termination threshold. Either empty ([]), in which case the algorithm runs for the maximum number of iterations maxit, or a scalar giving the minimum reduction of the objective function between two consecutive iterations, in percent.
cltype (str) – Either ‘kmeans’ for k-means cluster analysis (spherical clusters); ‘det’ for the determinant criterion of Spath, H., “Cluster-Formation and Analyse”, chapter 3 (ellipsoidal clusters, all clusters use the same ellipsoid); or ‘vardet’ for Spath, H., chapter 4 (each cluster uses its own ellipsoid). Note: the latter is the crisp version of the Gustafson-Kessel algorithm.
cov_constr (float) – Scalar in the range [0, 1]. Values > 0 trim the covariance matrix to avoid needle-like ellipsoids for the clusters. Applies only for cltype=’vardet’, but must always be provided.
- Returns:
idx (numpy array) – Cluster index number for each sample after the last iteration, as a column vector.
cent (numpy array) – Matrix of cluster centre positions after the last iteration, one cluster centre per row.
obj_fcn (numpy array) – Vector containing the value of the objective function after each iteration.
vrc (numpy array) – Variance Ratio Criterion.
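The iteration described above for cltype=’kmeans’ can be illustrated with a minimal, standalone NumPy sketch. It is not PyGMI's implementation: the function name kmeans_sketch, its defaults and the seed argument are hypothetical, it ignores centfix, cov_constr and the ‘det’/‘vardet’ variants, and it does not compute vrc. It only shows how random initialisation, the maxit limit and a percentage-based term_thresh might interact.

```python
import numpy as np


def kmeans_sketch(data, no_clust, maxit=100, term_thresh=0.01, seed=0):
    """Hypothetical stand-in for the 'kmeans' case; not PyGMI code."""
    rng = np.random.default_rng(seed)
    n_samples = data.shape[0]
    # Random initialisation: use no_clust distinct samples as centres.
    start = rng.choice(n_samples, no_clust, replace=False)
    cent = data[start].astype(float)
    obj_fcn = []
    for _ in range(maxit):
        # Squared Euclidean distance of every sample to every centre (N x K).
        dist = ((data[:, None, :] - cent[None, :, :]) ** 2).sum(axis=2)
        idx = dist.argmin(axis=1)
        # Objective: sum of distances of samples to their assigned centre.
        obj_fcn.append(dist[np.arange(n_samples), idx].sum())
        # Move each non-empty cluster centre to the mean of its members.
        for k in range(no_clust):
            if np.any(idx == k):
                cent[k] = data[idx == k].mean(axis=0)
        # Stop once the objective dropped by less than term_thresh percent.
        if len(obj_fcn) > 1 and obj_fcn[-2] > 0:
            drop = 100.0 * (obj_fcn[-2] - obj_fcn[-1]) / obj_fcn[-2]
            if drop < term_thresh:
                break
    return idx, cent, np.array(obj_fcn)


# Two well-separated synthetic blobs of 20 samples each.
rng = np.random.default_rng(1)
blob_a = rng.normal(0.0, 0.1, (20, 2))
blob_b = rng.normal(5.0, 0.1, (20, 2))
data = np.vstack([blob_a, blob_b])
idx, cent, obj_fcn = kmeans_sketch(data, 2)
```

In this sketch the objective is recorded after each assignment step, so obj_fcn is non-increasing and its relative change can serve as the stopping rule.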
- pygmi.clust.crisp_clust.gcentroids(data, index, no_clust, mindist)#
G Centroids.
- Parameters:
data (numpy array) – Input data.
index (numpy array) – Cluster index number for each sample.
no_clust (int) – Number of clusters to be used.
mindist (numpy array) – Minimum distances.
- Returns:
centroids (numpy array) – Centroids
index (numpy array) – Index
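As a rough illustration of a centroid update with this signature, the sketch below computes the mean of each cluster's members. The source does not say how mindist is used; the sketch assumes it holds each sample's distance to its own centre and uses it to re-seed an empty cluster with the farthest sample. That handling, and the name centroids_sketch, are assumptions rather than documented PyGMI behaviour.

```python
import numpy as np


def centroids_sketch(data, index, no_clust, mindist):
    """Hypothetical helper; not PyGMI's gcentroids."""
    index = index.copy()
    mindist = mindist.astype(float)  # astype returns a copy
    centroids = np.full((no_clust, data.shape[1]), np.nan)
    for k in range(no_clust):
        members = np.flatnonzero(index == k)
        if members.size > 0:
            # Centroid is the mean of all samples assigned to cluster k.
            centroids[k] = data[members].mean(axis=0)
        else:
            # Assumed handling of an empty cluster: seed it with the sample
            # that currently lies farthest from its own centre.
            far = int(mindist.argmax())
            mindist[far] = 0.0
            index[far] = k
            centroids[k] = data[far]
    return centroids, index


data = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]])
index = np.array([0, 0, 0])          # cluster 1 starts out empty
mindist = np.array([1.0, 1.0, 9.0])  # sample 2 is the farthest outlier
centroids, new_index = centroids_sketch(data, index, 2, mindist)
```

Here the empty cluster 1 is seeded with the outlying third sample, and the returned index reflects that reassignment.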
- pygmi.clust.crisp_clust.gdist(data, center, index, no_clust, cltype, cov_constr)#
G Dist routine.
- Parameters:
data (numpy array) – Input data.
center (numpy array) – Centre of each class.
index (numpy array) – Cluster index number for each sample.
no_clust (int) – Number of clusters to be used.
cltype (str) – Clustering type.
cov_constr (float) – Scalar in the range [0, 1], used to trim the covariance matrix (see crisp_means).
- Returns:
bigd – Output data.
- Return type:
numpy array
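To show what a distance routine of this kind might compute, the sketch below returns the distance of every sample to every cluster centre. It is a guess, not PyGMI's gdist: the name dist_sketch, the N x no_clust layout of the result and the pooled-covariance treatment of ‘det’ are all assumptions, and ‘vardet’ and cov_constr are left out. For ‘kmeans’ it uses squared Euclidean distance; for ‘det’ it uses a Mahalanobis-type distance with a single covariance matrix shared by all clusters.

```python
import numpy as np


def dist_sketch(data, center, index, no_clust, cltype="kmeans"):
    """Hypothetical helper; not PyGMI's gdist."""
    if cltype not in ("kmeans", "det"):
        raise ValueError(f"unsupported cltype: {cltype}")
    bigd = np.zeros((data.shape[0], no_clust))
    if cltype == "det":
        # Pooled within-cluster covariance shared by all clusters (assumed).
        resid = data - center[index]
        icov = np.linalg.inv(resid.T @ resid / data.shape[0])
    for k in range(no_clust):
        diff = data - center[k]
        if cltype == "kmeans":
            # Squared Euclidean distance to centre k.
            bigd[:, k] = (diff ** 2).sum(axis=1)
        else:
            # Quadratic form diff_i^T * icov * diff_i for every sample i.
            bigd[:, k] = np.einsum("ij,jk,ik->i", diff, icov, diff)
    return bigd


data = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                 [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]])
index = np.array([0, 0, 0, 1, 1, 1])
center = np.array([data[index == k].mean(axis=0) for k in range(2)])
d_euclid = dist_sketch(data, center, index, 2, "kmeans")
d_pooled = dist_sketch(data, center, index, 2, "det")
```

Taking the argmin along the cluster axis of either result gives the crisp assignment of each sample to its nearest centre.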