PREDICT.plotting package¶
Submodules¶
PREDICT.plotting.compute_CI module¶
- PREDICT.plotting.compute_CI.compute_confidence(metric, N_train, N_test, alpha=0.95)¶
Function to calculate the adjusted confidence interval.
- metric: numpy array containing the result for a metric for the different cross-validations (e.g. if 20 cross-validations are performed, it is a list of length 20 with the calculated accuracy for each cross-validation)
- N_train: integer, number of training samples
- N_test: integer, number of test samples
- alpha: float ranging from 0 to 1 to calculate the alpha*100% CI, default 0.95
- PREDICT.plotting.compute_CI.compute_confidence_logit(metric, N_train, N_test, alpha=0.95)¶
Function to calculate the adjusted confidence interval.
- metric: numpy array containing the result for a metric for the different cross-validations (e.g. if 20 cross-validations are performed, it is a list of length 20 with the calculated accuracy for each cross-validation)
- N_train: integer, number of training samples
- N_test: integer, number of test samples
- alpha: float ranging from 0 to 1 to calculate the alpha*100% CI, default 0.95
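Both functions above share this signature. As an illustration only, here is a minimal sketch of such an adjusted interval, assuming the corrected resampled t-test of Nadeau and Bengio (2003), the usual variance correction for overlapping cross-validation training sets; whether PREDICT implements exactly this formula is not stated on this page, and the function name below is invented:

```python
import numpy as np
from scipy import stats

def adjusted_confidence(metric, n_train, n_test, alpha=0.95):
    """Corrected resampled t-test CI (Nadeau & Bengio, 2003).

    The sample variance is inflated by (1/N + n_test/n_train) to
    account for the overlap between cross-validation training sets.
    """
    metric = np.asarray(metric, dtype=float)
    n = len(metric)
    mean = metric.mean()
    var = metric.var(ddof=1)
    se = np.sqrt((1.0 / n + n_test / n_train) * var)
    crit = stats.t.ppf(0.5 + alpha / 2.0, n - 1)
    return mean - crit * se, mean + crit * se
```

For example, 20 cross-validation accuracies with N_train=80 and N_test=20 would be passed as `adjusted_confidence(accuracies, 80, 20)`.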
PREDICT.plotting.getfeatureimages module¶
- PREDICT.plotting.getfeatureimages.gabor_filter(image, mask, kernel)¶
- PREDICT.plotting.getfeatureimages.getfeatureimages(image, segmentation, gabor_settings=None, image_type=None, parameters=None, types=['LBP'], slicenum=None, save=False)¶
- PREDICT.plotting.getfeatureimages.save_LBP_features(image, mask, output)¶
- PREDICT.plotting.getfeatureimages.save_gabor_features(image, mask, gabor_settings, output, n_jobs=None, backend=None)¶
Apply Gabor filters to the image, in parallel. Note: on a cluster, where process-based parallelisation of the Gabor filters is not possible, use backend="threading".
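The backend note above matches joblib's convention for selecting thread- versus process-based workers. A hypothetical sketch of the pattern, where apply_kernel is a stand-in and not PREDICT's actual filter code:

```python
from joblib import Parallel, delayed

def apply_kernel(kernel):
    # Stand-in for convolving the image with one Gabor kernel.
    return kernel * 2

kernels = [1, 2, 3]
# On a cluster where process-based parallelism is unavailable,
# switch to thread-based workers:
results = Parallel(n_jobs=2, backend="threading")(
    delayed(apply_kernel)(k) for k in kernels)
```

joblib returns the results in input order, so each output still corresponds to its kernel.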
PREDICT.plotting.linstretch module¶
- PREDICT.plotting.linstretch.linstretch(i, i_max=255, i_min=0)¶
Linearly stretch the pixel values of the input image i to the range [i_min, i_max].
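A minimal sketch of such a linear stretch; this is a common implementation, not necessarily PREDICT's exact code:

```python
import numpy as np

def linstretch(i, i_max=255, i_min=0):
    # Linearly map the array's value range onto [i_min, i_max].
    i = np.asarray(i, dtype=float)
    i = (i - i.min()) / (i.max() - i.min())  # normalise to [0, 1]
    return i * (i_max - i_min) + i_min
```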
PREDICT.plotting.plot_ROC module¶
- PREDICT.plotting.plot_ROC.ROC_thresholding(fprt, tprt, thresholds, nsamples=20)¶
Construct FPR and TPR ratios at different thresholds for the scores of an estimator.
- PREDICT.plotting.plot_ROC.main()¶
- PREDICT.plotting.plot_ROC.plot_ROC(prediction, pinfo, ensemble=1, label_type=None, output_png=None, output_tex=None, output_csv=None)¶
- PREDICT.plotting.plot_ROC.plot_ROC_CIc(y_truth, y_score, N_1, N_2, plot='default', alpha=0.95, verbose=False, DEBUG=False, tsamples=20)¶
Plot a Receiver Operator Characteristic (ROC) curve with confidence intervals.
- tsamples: number of sample points at which to determine the confidence intervals. The sample points are used as thresholds on y_score.
- PREDICT.plotting.plot_ROC.plot_single_ROC(y_truth, y_score, verbose=False)¶
Get the False Positive Ratio (FPR) and True Positive Ratio (TPR) for the ground truth and score of a single estimator. These ratios can be used to plot a Receiver Operator Characteristic (ROC) curve.
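For a single estimator, FPR/TPR pairs of this kind can be computed with scikit-learn. This is an illustration only: the labels and scores below are made up, and plot_single_ROC may compute the ratios differently:

```python
from sklearn.metrics import roc_curve

y_truth = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
fpr, tpr, thresholds = roc_curve(y_truth, y_score)
# Each (fpr[i], tpr[i]) pair is one point on the ROC curve,
# obtained by thresholding y_score at thresholds[i].
```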
PREDICT.plotting.plot_SVM module¶
- PREDICT.plotting.plot_SVM.main()¶
- PREDICT.plotting.plot_SVM.plot_SVM(prediction, label_data, label_type, show_plots=False, alpha=0.95, ensemble=False, verbose=True, ensemble_scoring=None, output='stats', modus='singlelabel')¶
Plot the output of a single binary estimator, e.g. an SVM.
- prediction: pandas dataframe or string, mandatory
- Output of the trainclassifier function: either a pandas dataframe or an HDF5 file.
- label_data: string, mandatory
- Contains the path referring to a .txt file containing the patient label(s) and value(s) to be used for learning. See the Github Wiki for the format.
- label_type: string, mandatory
- Name of the label to extract from the label data to test the estimator on.
- show_plots: Boolean, default False
- Determine whether matplotlib performance plots are made.
- alpha: float, default 0.95
- Significance of confidence intervals.
- ensemble: False, integer or ‘Caruana’
- Determine whether an ensemble will be created. If so, either provide an integer to determine how many of the top performing classifiers should be in the ensemble, or use the string “Caruana” to use smart ensembling based on Caruana et al. 2004.
- verbose: boolean, default True
- Print intermediate messages.
- ensemble_scoring: string, default None
- Metric to be used for evaluating the ensemble. If None, the option set in the prediction object will be used.
- output: string, default stats
- Determine which results are returned. If 'stats', the statistics of the estimator will be returned; if 'scores', the scores will be returned.
Depending on the output parameter, the following is returned:
If output == 'stats':
- stats: dictionary
- Contains the confidence intervals of the performance metrics and the number of times each patient was classified correctly or incorrectly.
If output == 'scores':
- y_truths: list
- Contains the true label for each object.
- y_scores: list
- Contains the score (e.g. posterior) for each object.
- y_predictions: list
- Contains the predicted label for each object.
- PIDs: list
- Contains the patient ID/name for each object.
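As a sketch of the per-patient bookkeeping behind the 'stats' output, counting how often each patient is classified correctly across cross-validation iterations; the fold structure and patient IDs below are invented, not PREDICT's internal format:

```python
# iteration -> {patient ID: predicted label}; hypothetical data
predictions = {
    0: {'P1': 1, 'P2': 0, 'P3': 1},
    1: {'P1': 1, 'P2': 1, 'P3': 1},
}
truth = {'P1': 1, 'P2': 0, 'P3': 0}  # ground-truth label per patient

# Count correct classifications per patient over all iterations.
correct = {pid: 0 for pid in truth}
for fold in predictions.values():
    for pid, label in fold.items():
        if label == truth[pid]:
            correct[pid] += 1
```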
PREDICT.plotting.plot_SVR module¶
- PREDICT.plotting.plot_SVR.main()¶
- PREDICT.plotting.plot_SVR.plot_single_SVR(prediction, mutation_data, label_type, survival=False, show_plots=False, alpha=0.95)¶
PREDICT.plotting.plot_barchart module¶
- PREDICT.plotting.plot_barchart.count_parameters(parameters)¶
- PREDICT.plotting.plot_barchart.main()¶
- PREDICT.plotting.plot_barchart.paracheck(parameters)¶
- PREDICT.plotting.plot_barchart.plot_barchart(prediction, estimators=10, label_type=None, output_tex=None, output_png=None)¶
Make a barchart of the top X hyperparameter settings of the ranked estimators in all cross-validation iterations.
- prediction: filepath, mandatory
- Path pointing to the .hdf5 file which is the output of the trainclassifier function.
- estimators: integer, default 10
- Number of hyperparameter settings/estimators used in each cross validation. The settings are ranked, so when supplying e.g. 10, the best 10 settings in each cross validation setting will be used.
- label_type: string, default None
- The name of the label predicted by the estimator. If None, the first label from the prediction file will be used.
- output_tex: filepath, optional
- If given, the barchart will be written to this tex file.
- output_png: filepath, optional
- If given, the barchart will be written to this png file.
Returns:
- fig: matplotlib figure
- The figure in which the barchart is plotted.
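A hypothetical sketch of the tallying behind such a barchart; the settings shown are invented, and count_parameters may organise its counts differently:

```python
from collections import Counter

# Hyperparameter settings of the top-ranked estimators (made up).
top_settings = [{'kernel': 'rbf', 'C': 1},
                {'kernel': 'rbf', 'C': 10},
                {'kernel': 'poly', 'C': 1}]

# For each hyperparameter, count how often each value occurs.
counts = {key: Counter(s[key] for s in top_settings)
          for key in top_settings[0]}
```

The per-parameter counts are what a barchart of "most frequently selected settings" would display.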
- PREDICT.plotting.plot_barchart.plot_bars(params, normalization_factor=None, figwidth=20, fontsize=20)¶
PREDICT.plotting.plot_boxplot module¶
- PREDICT.plotting.plot_boxplot.generate_boxplots(image_features, mutation_data, outputfolder)¶
Generate boxplots of the feature values among different objects.
- image_features: list, mandatory
- List with a dictionary of the feature labels and values for each patient.
- mutation_data: pandas dataframe, mandatory
- Dataframe containing the labels of the objects.
- outputfolder: path, mandatory
- Folder to which the output boxplots should be written.
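A minimal sketch of per-feature boxplots grouped by label, using matplotlib's non-interactive Agg backend; the feature names and values are made up, and generate_boxplots' exact layout is not documented here:

```python
import matplotlib
matplotlib.use('Agg')  # headless backend, no display needed
import matplotlib.pyplot as plt

# Hypothetical feature values grouped by object label.
features = {'volume': {'benign': [1.2, 1.5, 1.1],
                       'malignant': [3.1, 2.8, 3.4]}}

for name, groups in features.items():
    fig, ax = plt.subplots()
    ax.boxplot(list(groups.values()))        # one box per label group
    ax.set_xticks(range(1, len(groups) + 1))
    ax.set_xticklabels(list(groups.keys()))
    ax.set_title(name)
    fig.savefig('boxplot_%s.png' % name)     # one file per feature
    plt.close(fig)
```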
- PREDICT.plotting.plot_boxplot.main()¶
PREDICT.plotting.plot_images module¶
- PREDICT.plotting.plot_images.bbox_2D(img, mask, padding=[1, 1], img2=None)¶
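A simplified sketch of what a function with this signature typically does (the img2 argument is omitted, and this is an assumption about the behaviour, not PREDICT's actual code): crop the image to the bounding box of the nonzero mask region, expanded by a padding margin.

```python
import numpy as np

def bbox_2D(img, mask, padding=(1, 1)):
    # Find the rows and columns that contain any mask voxels.
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    rmin, rmax = np.where(rows)[0][[0, -1]]
    cmin, cmax = np.where(cols)[0][[0, -1]]
    # Expand by the padding, clipped to the image bounds.
    rmin = max(rmin - padding[0], 0)
    rmax = min(rmax + padding[0], mask.shape[0] - 1)
    cmin = max(cmin - padding[1], 0)
    cmax = min(cmax + padding[1], mask.shape[1] - 1)
    return img[rmin:rmax + 1, cmin:cmax + 1]
```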
- PREDICT.plotting.plot_images.plot_im_and_overlay(image, mask, figsize=(3, 3), alpha=0.15)¶
Plot an image in a matplotlib figure and overlay with a mask.
- PREDICT.plotting.plot_images.slicer(image, mask, output_name, output_name_zoom, thresholds=[-240, 160], zoomfactor=4)¶
The image and mask should both be arrays.
PREDICT.plotting.plot_ranked_scores module¶
- PREDICT.plotting.plot_ranked_scores.example()¶
- PREDICT.plotting.plot_ranked_scores.main()¶
- PREDICT.plotting.plot_ranked_scores.plot_ranked_images(pinfo, label_type, images, segmentations, ranked_truths, ranked_scores, ranked_PIDs, output_zip=None, output_itk=None, zoomfactor=4)¶
- PREDICT.plotting.plot_ranked_scores.plot_ranked_percentages(estimator, pinfo, label_type=None, ensemble=50, output_csv=None)¶
- PREDICT.plotting.plot_ranked_scores.plot_ranked_posteriors(estimator, pinfo, label_type=None, ensemble=50, output_csv=None)¶
- PREDICT.plotting.plot_ranked_scores.plot_ranked_scores(estimator, pinfo, label_type, scores='percentages', images=[], segmentations=[], ensemble=50, output_csv=None, output_zip=None, output_itk=None)¶
Rank the patients according to their average score. The score can either be the average posterior or the percentage of times the patient was classified correctly in the cross-validations. Additionally, the middle slice of each patient is plotted and saved according to the ranking.
- estimator: filepath, mandatory
- Path pointing to the .hdf5 file which is the output of the trainclassifier function.
- pinfo: filepath, mandatory
- Path pointing to the .txt file which contains the patient label information.
- label_type: string, default None
- The name of the label predicted by the estimator. If None, the first label from the prediction file will be used.
- scores: string, default percentages
- Type of scoring to be used. Either ‘posteriors’ or ‘percentages’.
- images: list, optional
- List containing the filepaths to the ITKImage image files of the patients.
- segmentations: list, optional
- List containing the filepaths to the ITKImage segmentation files of the patients.
- ensemble: integer or string, optional
- Method to be used for ensembling. Either an integer for a fixed size or ‘Caruana’ for the Caruana method, see the SearchCV function for more details.
- output_csv: filepath, optional
- If given, the scores will be written to this csv file.
- output_zip: filepath, optional
- If given, the images will be plotted and the pngs saved to this zip file.
- output_itk: filepath, optional
- WIP
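A hypothetical sketch of the ranking step for the scores='posteriors' case: average each patient's posterior over the cross-validation iterations, then sort. The patient IDs and scores below are invented:

```python
import numpy as np

pids = ['P1', 'P2', 'P3']
posteriors = np.array([[0.9, 0.2, 0.6],   # cross-validation 1
                       [0.8, 0.3, 0.5]])  # cross-validation 2

mean_scores = posteriors.mean(axis=0)     # average per patient
order = np.argsort(mean_scores)           # ascending: lowest score first
ranked_pids = [pids[i] for i in order]
ranked_scores = mean_scores[order]
```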