metrics Module
This module contains evaluation metrics that can be used to assess the performance of learners.
author: Michael Heilman (mheilman@ets.org)
author: Nitin Madnani (nmadnani@ets.org)
author: Dan Blanchard (dblanchard@ets.org)
organization: ETS
-
skll.metrics.f1_score_least_frequent(y_true, y_pred)
Calculate the F1 score of the least frequent label/class in y_true for y_pred.
Parameters: - y_true (array-like of float) – The true/actual/gold labels for the data.
- y_pred (array-like of float) – The predicted/observed labels for the data.
Returns: ret_score – F1 score of the least frequent label.
Return type: float
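The following is a minimal pure-Python sketch of the idea behind this metric, not the library's implementation. The name f1_least_frequent_sketch and the tie-breaking rule (the smallest label wins when several labels are equally rare) are assumptions made for illustration.

```python
from collections import Counter


def f1_least_frequent_sketch(y_true, y_pred):
    """Illustrative only: F1 for the label that is rarest in y_true."""
    counts = Counter(y_true)
    # Assumption of this sketch: ties go to the smallest label.
    target = min(counts, key=lambda label: (counts[label], label))

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)

    # With no true positives, precision and recall are both zero.
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


y_true = [0, 0, 0, 0, 1, 1]  # label 1 is the least frequent
y_pred = [0, 0, 1, 0, 1, 0]
score = f1_least_frequent_sketch(y_true, y_pred)  # 0.5
```

Here label 1 has one true positive, one false positive, and one false negative, so precision and recall are both 0.5.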
-
skll.metrics.kappa(y_true, y_pred, weights=None, allow_off_by_one=False)
Calculate the kappa inter-rater agreement between the gold standard and the predicted ratings. Potential values range from -1 (representing complete disagreement) to 1 (representing complete agreement). A kappa value of 0 is expected if all agreement is due to chance.
In the course of calculating kappa, all items in y_true and y_pred will first be converted to floats and then rounded to integers.
It is assumed that y_true and y_pred contain the complete range of possible ratings.
This function contains a combination of code from yorchopolis’s kappa-stats and Ben Hamner’s Metrics projects on GitHub.
Parameters: - y_true (array-like of float) – The true/actual/gold labels for the data.
- y_pred (array-like of float) – The predicted/observed labels for the data.
- weights (str or np.array, optional) – Specifies the weight matrix for the calculation. Defaults to None. Options are:
  - None = unweighted kappa
  - 'quadratic' = quadratic-weighted kappa
  - 'linear' = linear-weighted kappa
  - two-dimensional numpy array = a custom matrix of weights. Each weight corresponds to the \(w_{ij}\) values in the Wikipedia description of how to calculate weighted Cohen’s kappa.
- allow_off_by_one (bool, optional) – If true, ratings that are off by one are counted as equal, and all other differences are reduced by one. For example, 1 and 2 will be considered equal, whereas 1 and 3 will have a difference of 1 when building the weights matrix. Defaults to False.
Returns: k – The kappa score, or weighted kappa score.
Return type: float
Raises: - AssertionError – If y_true and y_pred do not have the same length.
- ValueError – If labels cannot be converted to int.
- ValueError – If an invalid weight scheme is specified.
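To make the weighting options and the allow_off_by_one adjustment concrete, here is a hedged pure-Python sketch of weighted kappa as described above. It is an illustration under stated assumptions, not the library's code: the name kappa_sketch is hypothetical, custom weight matrices are omitted, the length check is assumed, and the rating range is taken from the observed minimum and maximum.

```python
def kappa_sketch(y_true, y_pred, weights=None, allow_off_by_one=False):
    """Illustrative only: unweighted, linear, or quadratic kappa."""
    # Convert to floats, then round to integers, as the description states.
    y_true = [int(round(float(y))) for y in y_true]
    y_pred = [int(round(float(y))) for y in y_pred]
    # Assumption of this sketch: both sequences must be equally long.
    assert len(y_true) == len(y_pred)

    lo = min(y_true + y_pred)
    hi = max(y_true + y_pred)
    n_ratings = hi - lo + 1
    n_items = len(y_true)

    def weight(i, j):
        """Disagreement weight w_ij for ratings i and j."""
        diff = abs(i - j)
        if allow_off_by_one and diff:
            diff -= 1  # off-by-one counts as equal; other gaps shrink by one
        if weights is None:
            return 1.0 if diff else 0.0
        if weights == "linear":
            return float(diff)
        if weights == "quadratic":
            return float(diff ** 2)
        raise ValueError("Invalid weight scheme: {!r}".format(weights))

    # Observed confusion counts and their marginal histograms.
    observed = [[0] * n_ratings for _ in range(n_ratings)]
    for t, p in zip(y_true, y_pred):
        observed[t - lo][p - lo] += 1
    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(col) for col in zip(*observed)]

    # kappa = 1 - (weighted observed disagreement / weighted chance disagreement)
    num = den = 0.0
    for i in range(n_ratings):
        for j in range(n_ratings):
            w = weight(i, j)
            num += w * observed[i][j] / n_items
            den += w * hist_true[i] * hist_pred[j] / n_items ** 2
    return 1.0 if den == 0 else 1.0 - num / den


chance_level = kappa_sketch([1, 1, 2, 2], [1, 2, 1, 2])  # 0.0
tolerant = kappa_sketch([1, 2, 3], [2, 3, 4],
                        weights="quadratic", allow_off_by_one=True)  # 1.0
```

In the second call every prediction is off by exactly one, so with allow_off_by_one=True all observed disagreements receive zero weight and the score is 1.0.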
-
skll.metrics.kendall_tau(y_true, y_pred)
Calculate Kendall’s tau between y_true and y_pred.
Parameters: - y_true (array-like of float) – The true/actual/gold labels for the data.
- y_pred (array-like of float) – The predicted/observed labels for the data.
Returns: ret_score – Kendall’s tau if well-defined, else 0.0
Return type: float
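As a rough illustration of what "if well-defined, else 0.0" means here, the sketch below computes the tau-b variant of Kendall's tau by counting pairs and falls back to 0.0 when the denominator vanishes. The choice of tau-b and the name kendall_tau_sketch are assumptions of this example; the library's own computation may differ in detail.

```python
import math
from itertools import combinations


def kendall_tau_sketch(y_true, y_pred):
    """Illustrative only: Kendall's tau-b with a 0.0 fallback."""
    concordant = discordant = ties_true = ties_pred = 0
    for (t1, p1), (t2, p2) in combinations(zip(y_true, y_pred), 2):
        dt, dp = t1 - t2, p1 - p2
        if dt == 0 and dp == 0:
            continue  # tied in both sequences: not counted
        elif dt == 0:
            ties_true += 1  # tied only in y_true
        elif dp == 0:
            ties_pred += 1  # tied only in y_pred
        elif dt * dp > 0:
            concordant += 1
        else:
            discordant += 1

    denom = math.sqrt((concordant + discordant + ties_true)
                      * (concordant + discordant + ties_pred))
    # A constant sequence makes the denominator zero; return 0.0 instead.
    return (concordant - discordant) / denom if denom else 0.0


tau = kendall_tau_sketch([1, 2, 3, 4], [1, 3, 2, 4])
undefined_case = kendall_tau_sketch([1, 2, 3], [5, 5, 5])  # 0.0
```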
-
skll.metrics.pearson(y_true, y_pred)
Calculate the Pearson product-moment correlation coefficient between y_true and y_pred.
Parameters: - y_true (array-like of float) – The true/actual/gold labels for the data.
- y_pred (array-like of float) – The predicted/observed labels for the data.
Returns: ret_score – Pearson product-moment correlation coefficient if well-defined, else 0.0
Return type: float
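The fallback to 0.0 matters when one of the inputs has zero variance, since the correlation is then undefined. A minimal sketch of that behavior, with the hypothetical name pearson_sketch and no claim to mirror the library's implementation:

```python
import math


def pearson_sketch(y_true, y_pred):
    """Illustrative only: Pearson's r with a 0.0 fallback."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    denom = math.sqrt(var_t * var_p)
    # Zero variance in either input leaves r undefined; return 0.0 instead.
    return cov / denom if denom else 0.0


linear = pearson_sketch([1, 2, 3], [2, 4, 6])
constant_pred = pearson_sketch([1, 2, 3], [5, 5, 5])  # 0.0
```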
-
skll.metrics.spearman(y_true, y_pred)
Calculate Spearman’s rank correlation coefficient between y_true and y_pred.
Parameters: - y_true (array-like of float) – The true/actual/gold labels for the data.
- y_pred (array-like of float) – The predicted/observed labels for the data.
Returns: ret_score – Spearman’s rank correlation coefficient if well-defined, else 0.0
Return type: float
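Spearman's coefficient can be viewed as the Pearson correlation of the ranks of the two sequences. The sketch below illustrates that view, assuming tied values receive the average of their ranks; average_ranks and spearman_sketch are hypothetical names, not part of this module.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_sketch(y_true, y_pred):
    """Illustrative only: Pearson correlation of ranks, 0.0 if undefined."""
    rt, rp = average_ranks(y_true), average_ranks(y_pred)
    n = len(rt)
    mt, mp = sum(rt) / n, sum(rp) / n
    cov = sum((a - mt) * (b - mp) for a, b in zip(rt, rp))
    denom = math.sqrt(sum((a - mt) ** 2 for a in rt)
                      * sum((b - mp) ** 2 for b in rp))
    return cov / denom if denom else 0.0


monotone = spearman_sketch([1, 2, 3, 4], [1, 4, 9, 16])
tied_ranks = average_ranks([10, 20, 10])  # [1.5, 3.0, 1.5]
```

The first call shows that a monotone but non-linear relationship still yields a perfect rank correlation.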
-
skll.metrics.use_score_func(func_name, y_true, y_pred)
Call the scoring function in sklearn.metrics.SCORERS with the given name. This takes care of handling keyword arguments that were pre-specified when creating the scorer. This applies any sign-flipping that was specified by make_scorer() when the scorer was created.
Parameters: - func_name (str) – The name of the objective function to use from SCORERS.
- y_true (array-like of float) – The true/actual/gold labels for the data.
- y_pred (array-like of float) – The predicted/observed labels for the data.
Returns: ret_score – The scored result from the given scorer.
Return type: float
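The combination of pre-specified keyword arguments and sign-flipping can be pictured with a small stand-in. Everything in this sketch is hypothetical: StandInScorer, SCORERS_SKETCH, and the two metric functions are illustrative substitutes and are not scikit-learn's scorer objects or registry.

```python
class StandInScorer:
    """Hypothetical stand-in for a scorer; not scikit-learn's class."""

    def __init__(self, score_func, sign=1, **kwargs):
        self.score_func = score_func
        self.sign = sign      # -1 for metrics where lower is better
        self.kwargs = kwargs  # keyword arguments fixed at creation time


def mean_abs_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def within_tolerance(y_true, y_pred, tol=0.0):
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return hits / len(y_true)


# Hypothetical registry that plays the role of a name-to-scorer mapping.
SCORERS_SKETCH = {
    "neg_mean_abs_error": StandInScorer(mean_abs_error, sign=-1),
    "within_one": StandInScorer(within_tolerance, tol=1.0),
}


def use_score_func_sketch(func_name, y_true, y_pred):
    """Look up the scorer, pass its stored kwargs, apply its sign."""
    scorer = SCORERS_SKETCH[func_name]
    return scorer.sign * scorer.score_func(y_true, y_pred, **scorer.kwargs)


flipped = use_score_func_sketch("neg_mean_abs_error", [1, 2, 3], [2, 2, 5])  # -1.0
with_kwargs = use_score_func_sketch("within_one", [1, 2, 3], [2, 2, 5])
```

The first result is negative because the error metric was registered with a sign of -1, so that larger values are always better from the caller's point of view.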