experiments Module
Functions related to running experiments and parsing configuration files.
author: Dan Blanchard (dblanchard@ets.org)
author: Michael Heilman (mheilman@ets.org)
author: Nitin Madnani (nmadnani@ets.org)
author: Chee Wee Leong (cleong@ets.org)
class skll.experiments.NumpyTypeEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
Bases: json.encoder.JSONEncoder
This class is used when serializing results, particularly the input label values if the input has int-valued labels. Numpy int64 objects can’t be serialized by the json module, so we must convert them to int objects.
This was adapted from a related Stack Overflow question: http://stackoverflow.com/questions/11561932/why-does-json-dumpslistnp-arange5-fail-while-json-dumpsnp-arange5-tolis
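The general technique can be sketched as follows. This is a minimal illustration, not SKLL's actual implementation: the class name `DuckTypedNumpyEncoder` and the stand-in `FakeInt64` are hypothetical, and the sketch assumes the value to convert exposes a `tolist()` method (as NumPy scalars and arrays do), so that it runs without NumPy installed.

```python
import json


class DuckTypedNumpyEncoder(json.JSONEncoder):
    """Hypothetical encoder: converts objects exposing ``tolist()`` to
    plain Python values before serialization."""

    def default(self, obj):
        # json calls default() only for objects it cannot serialize itself.
        to_list = getattr(obj, "tolist", None)
        if callable(to_list):
            return to_list()
        # Anything else falls through to the standard TypeError.
        return super().default(obj)


class FakeInt64:
    """Stand-in for a NumPy integer scalar (assumption for this sketch)."""

    def __init__(self, value):
        self._value = value

    def tolist(self):
        return int(self._value)


labels = [FakeInt64(1), FakeInt64(2)]
encoded = json.dumps({"labels": labels}, cls=DuckTypedNumpyEncoder)
print(encoded)
```

Passing the encoder through the `cls` argument of `json.dumps` keeps the conversion in one place, so the calling code does not need to convert label values before serializing results.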
skll.experiments.run_configuration(config_file, local=False, overwrite=True, queue='all.q', hosts=None, write_summary=True, quiet=False, ablation=0, resume=False, log_level=20)
Takes a configuration file and runs the specified jobs on the grid.
Parameters:
- config_file (str) – Path to the configuration file we would like to use.
- local (bool, optional) – Should this be run locally instead of on the cluster? Defaults to False.
- overwrite (bool, optional) – If the model files already exist, should we overwrite them instead of re-using them? Defaults to True.
- queue (str, optional) – The DRMAA queue to use if we’re running on the cluster. Defaults to 'all.q'.
- hosts (list of str, optional) – If running on the cluster, these are the machines we should use. Defaults to None.
- write_summary (bool, optional) – Write a TSV file with a summary of the results. Defaults to True.
- quiet (bool, optional) – Suppress printing of “Loading…” messages. Defaults to False.
- ablation (int, optional) – Number of features to remove when doing an ablation experiment. If positive, we will perform repeated ablation runs for all combinations of features, removing the specified number at a time. If None, we will use all combinations of all lengths. If 0, the default, no ablation is performed. If negative, a ValueError is raised. Defaults to 0.
- resume (bool, optional) – If result files already exist for an experiment, do not overwrite them. This is very useful when doing a large ablation experiment and part of it crashes. Defaults to False.
- log_level (str, optional) – The level for logging messages. Defaults to logging.INFO (20).
Returns: result_json_paths – A list of paths to .json results files for each variation in the experiment.
Return type: list of str
Raises:
- ValueError – If the value for "ablation" is not a non-negative int or None.
- OSError – If the length of the FeatureSet name is greater than 210.