Base Wrapper

Overview Information Here: boa.wrappers

class boa.wrappers.base_wrapper.BaseWrapper(config_path: Optional[PathLike] = None, config: Optional[dict] = None, setup=True, *args, **kwargs)

Bases: object

Parameters:

config_path (PathLike) –
config (dict) –

property metric_names: list of metric names associated with this experiment

property config

property experiment_dir

property working_dir

property output_dir

classmethod path(): Path of the file that the Wrapper class is defined in

load_config(config_path: PathLike, *args, **kwargs) → dict

load_config takes the path to a JSON or YAML configuration file and returns your configuration dictionary.

Unless overridden in a subclass, load_config also applies some basic “normalizations” to your configuration for convenience. See normalize_config() for more information about how the normalization works and which config options you can control.

The default implementation should work for most JSON or YAML files, but it can be overridden in subclasses if need be.

Parameters: config_path (PathLike) – File path for the experiment configuration file
Returns: loaded_config
Return type: dict
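
As a rough illustration of what loading a JSON or YAML file into a configuration dictionary involves, the sketch below dispatches on the file extension. It is not BOA's implementation and it skips the normalization step entirely. The function name load_config_sketch is hypothetical, and the YAML branch assumes PyYAML is installed.

```python
import json
from pathlib import Path


def load_config_sketch(config_path):
    """Hypothetical stand-in: read a JSON or YAML file into a dict."""
    path = Path(config_path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        with path.open() as f:
            return json.load(f)
    if suffix in {".yaml", ".yml"}:
        import yaml  # assumption: PyYAML is available in your environment

        with path.open() as f:
            return yaml.safe_load(f)
    raise ValueError(f"Unsupported config file type: {suffix!r}")
```

A subclass that needs a different file format could override load_config with something along these lines and return the resulting dictionary.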

mk_experiment_dir(experiment_dir: PathLike = None, output_dir: PathLike = None, experiment_name: str = None, append_timestamp: bool = None, **kwargs) → Path

Make the experiment directory that boa will write all of its trials and logs to.

All parameters can be set in your configuration file as well:

experiment_dir -> optimization_options -> experiment_dir
experiment_name -> optimization_options -> experiment -> name
append_timestamp -> script_options -> append_timestamp

Parameters:

experiment_dir (PathLike) – Path to the directory for the output of the experiment. You may specify this or output_dir in your configuration file instead. (Defaults to the value in your configuration file, then None.)
output_dir (PathLike) – Output directory of the project. experiment_dir will be placed inside output_dir, based on the experiment name, so only one of experiment_dir or output_dir may be specified. You may specify this or experiment_dir in your configuration file instead. (Defaults to the value in your configuration file, then None. If neither experiment_dir nor output_dir is specified, output_dir defaults to the current working directory, i.e. whatever pwd returns or its equivalent on Windows.)
experiment_name (str) – Name of the experiment, used to build the path to the experiment directory inside the output directory. (Defaults to the value in your configuration file, then boa_runs.)
append_timestamp (bool) – Whether to append a timestamp to the end of the experiment directory name to ensure uniqueness. (Defaults to the value in your configuration file, then True.)

Return type:

Path
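
The precedence rules described above can be sketched as a small standalone function. This is only an illustration of the documented behavior, not BOA's code: the name make_experiment_dir_sketch is hypothetical, the timestamp format is made up, and applying the timestamp only when the path is built from output_dir is an assumption.

```python
from datetime import datetime
from pathlib import Path


def make_experiment_dir_sketch(experiment_dir=None, output_dir=None,
                               experiment_name="boa_runs", append_timestamp=True):
    """Hypothetical helper mirroring the documented rules."""
    if experiment_dir and output_dir:
        raise ValueError("Specify only one of experiment_dir or output_dir")
    if experiment_dir:
        path = Path(experiment_dir)
    else:
        # Fall back to the current working directory when neither is given
        base = Path(output_dir) if output_dir else Path.cwd()
        name = experiment_name
        if append_timestamp:
            # Assumption: this timestamp format is illustrative only
            name = f"{name}_{datetime.now():%Y%m%dT%H%M%S}"
        path = base / name
    path.mkdir(parents=True, exist_ok=True)
    return path
```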

setup(*args, **kwargs)

Method for subclasses to override to run any setup code they need, either on class init (which happens by default unless you pass setup=False) or after init by calling this method directly.

By default, this method runs mk_experiment_dir. If you override this method to do more setup, either include a call to mk_experiment_dir (the default version or your own implementation) or call super().setup(*args, **kwargs), which runs the original version and, in turn, mk_experiment_dir.
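
The override pattern might look like the sketch below. So that it runs without BOA installed, it uses a stand-in base class (StandInBaseWrapper, hypothetical) that only mimics the behavior described here; in real code you would subclass BaseWrapper instead. The observations attribute is likewise made up for illustration.

```python
class StandInBaseWrapper:
    """Stand-in for BaseWrapper so the sketch runs without BOA installed."""

    def __init__(self, setup=True):
        self.experiment_dir_made = False
        if setup:
            self.setup()

    def mk_experiment_dir(self):
        # The real method creates a directory; here we only record the call
        self.experiment_dir_made = True

    def setup(self, *args, **kwargs):
        self.mk_experiment_dir()


class MyWrapper(StandInBaseWrapper):
    def setup(self, *args, **kwargs):
        # Keep the default behavior (making the experiment directory) ...
        super().setup(*args, **kwargs)
        # ... then do any extra, model-specific setup (hypothetical attribute)
        self.observations = [1.12, 1.25, 2.54, 4.52]


wrapper = MyWrapper()               # setup runs during init
deferred = MyWrapper(setup=False)   # setup is skipped ...
deferred.setup()                    # ... until you call it directly
```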

write_configs(trial: BaseTrial) → None

This function is usually used to write out the configuration files used in an individual optimization trial run, or to dynamically write a run script to start an optimization trial run.

Parameters: trial (BaseTrial) –
Return type: None
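
A minimal sketch of the first use case, writing one trial's parameters to a file your model can read, could look like this. The helper name write_trial_config, the trial_0000 folder layout, and the params.json file name are all hypothetical; the stand-in trial object only provides the index and arm.parameters attributes that a real trial exposes.

```python
import json
from pathlib import Path
from types import SimpleNamespace


def write_trial_config(trial, experiment_dir):
    """Hypothetical helper: write one trial's parameters to its own folder."""
    trial_dir = Path(experiment_dir) / f"trial_{trial.index:04d}"
    trial_dir.mkdir(parents=True, exist_ok=True)
    config_file = trial_dir / "params.json"
    config_file.write_text(json.dumps(trial.arm.parameters, indent=2))
    return config_file


# Stand-in for a real trial, with made-up parameter names and values
example_trial = SimpleNamespace(
    index=0, arm=SimpleNamespace(parameters={"x0": 0.5, "x1": 2.0})
)
```

Inside a wrapper, write_configs(self, trial) could call a helper like this with self.experiment_dir.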

run_model(trial: BaseTrial) → None

Runs a model by deploying a given trial.

Parameters: trial (BaseTrial) –
Return type: None

set_trial_status(trial: BaseTrial) → None

Marks the status of a trial to reflect the status of the model run for the trial.

Each trial is polled periodically to determine its status (completed, failed, still running, etc.). This function defines the criteria for determining the status of the model run for a trial, and the trial status is updated accordingly when the trial is polled.

The approach for determining the trial status will depend on the structure of the particular model and its outputs. One example is checking the log files of the model.

Todo

Add examples/links of different approaches

Parameters: trial (BaseTrial) –
Return type: None

Examples

trial.mark_completed()
trial.mark_failed()
trial.mark_abandoned()
trial.mark_early_stopped()

You can also do:

from ax.core.base_trial import TrialStatus
trial.mark_as(TrialStatus.COMPLETED)

or:

trial.mark_as(3) # TrialStatus is an ENUM with COMPLETED being equivalent to 3

Relevant ENUM list

You can set the status using either the text version or the numerical equivalent.

Relevant ENUM list	Numerical Equivalent
FAILED	2
COMPLETED	3
RUNNING	4 – you don’t need to set it to running, it is already set to running
ABANDONED	5
EARLY_STOPPED	7
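
One way to apply the log-file approach mentioned above is sketched below. The log markers ("ERROR" and "Run finished"), the log_path argument, and both function names are assumptions for illustration; your model's logs will differ, and the real method receives only the trial.

```python
from pathlib import Path


def status_from_log(log_path):
    """Hypothetical helper: infer a run's status from its log file."""
    path = Path(log_path)
    if not path.exists():
        return "running"  # nothing written yet, so keep polling
    text = path.read_text()
    if "ERROR" in text:
        return "failed"
    if "Run finished" in text:
        return "completed"
    return "running"


def set_trial_status_sketch(trial, log_path):
    """Apply the inferred status with the trial's mark_* methods."""
    status = status_from_log(log_path)
    if status == "completed":
        trial.mark_completed()
    elif status == "failed":
        trial.mark_failed()
    # Otherwise leave the trial alone: it is already marked as running
```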

See also

# TODO add sphinx link to ax trial status

fetch_trial_data(trial: BaseTrial, metric_properties: dict, metric_name: str, *args, **kwargs) → dict

Retrieves the trial data for either the one metric that is specified in metric_name or all metrics at once.

For example, for a case where you are minimizing the error between a model and observations, using RMSE as a metric, this function would load the model output and the corresponding observation data that will be passed to the RMSE metric.

The return value of this function is a dictionary of dictionaries. Each key is the name of a metric, and each value is a dictionary of the keyword arguments to pass to that metric's function. If you are returning just one metric, you do not need the outer dictionary and can return the dictionary of keyword arguments directly.

Among the keyword arguments, you can also include the key “sem” to give the standard error of this metric for this trial.

Parameters:

trial (BaseTrial) – The current trial. Parameters can be accessed as trial.arm.parameters and the trial index as trial.index.
metric_properties (dict) – Collection of all metric properties for all metrics, as a nested dictionary. The properties of a specific metric can be accessed as metric_properties["metric_name1"].
metric_name (str) – The name of the metric that the arguments are being fetched for, if you choose to return only one metric at a time.

Returns:

A dictionary whose keys are the names of specific metrics and whose values are dictionaries of keyword arguments to pass to the corresponding metric function. For example, Mean uses np.mean, which expects the parameter a (an array-like object), so you could return {"Mean": {"a": [1, 2, 3, 4]}}. You can also include a key "sem" that is the standard error of the mean for this trial's metric value.

Example return values:

{
    "Mean": {"a": trial.arm.parameters, "sem": 4.5},
    "RMSE": {
        "y_true": [1.12, 1.25, 2.54, 4.52],
        "y_pred": trial.arm.parameters,
    },
}

{"Mean": {"a": trial.arm.parameters}}

{"a": trial.arm.parameters, "sem": 1}

Return type:

dict
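
To make the shape of the return value concrete, the sketch below shows one way such a dictionary could be consumed: "sem" is split off and the remaining entries are passed as keyword arguments to the matching metric function. How BOA itself dispatches is not shown in this section, so treat the loop as an assumption. The metric_funcs registry and the plain-Python mean and rmse functions are hypothetical stand-ins.

```python
import math


def mean(a):
    """Plain-Python stand-in for a Mean metric function."""
    return sum(a) / len(a)


def rmse(y_true, y_pred):
    """Plain-Python stand-in for an RMSE metric function."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


metric_funcs = {"Mean": mean, "RMSE": rmse}  # hypothetical registry

# The kind of dictionary fetch_trial_data might return for two metrics
fetched = {
    "Mean": {"a": [1, 2, 3, 4], "sem": 4.5},
    "RMSE": {"y_true": [1.0, 2.0, 3.0], "y_pred": [1.0, 2.0, 5.0]},
}

results = {}
for name, kwargs in fetched.items():
    kwargs = dict(kwargs)          # copy so the fetched data is left untouched
    sem = kwargs.pop("sem", None)  # "sem" is metadata, not a metric argument
    results[name] = (metric_funcs[name](**kwargs), sem)
```

The keys of each inner dictionary must match the parameter names of the metric function, which is why Mean uses "a" and RMSE uses "y_true" and "y_pred".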

Examples

This example returns all the metrics at once. You can imagine replacing the hard-coded values with a “calc_stuff” function that computes whatever you need to pass in.

>>> def fetch_trial_data(self, trial, metric_properties, metric_name, *args, **kwargs):
...     return {
...         "Mean": {"a": trial.arm.parameters.values(), "sem": 4.5},
...         "RMSE": {
...             "y_true": [1.12, 1.25, 2.54, 4.52],
...             "y_pred": trial.arm.parameters.values(),
...         },
...     }

This one returns only one metric at a time. It is somewhat fragile: if you change the names of the metrics in the config, it will break. But for quick and dirty things, this can be great.

>>> def fetch_trial_data(self, trial, metric_properties, metric_name, *args, **kwargs):
...     if metric_name == "Mean":
...         return {"a": trial.arm.parameters.values(), "sem": 4.5}
...     elif metric_name == "RMSE":
...         return {
...             "y_true": [1.12, 1.25, 2.54, 4.52],
...             "y_pred": trial.arm.parameters.values(),
...         }

This one is a little more complicated. It assumes that, in your config, you define a properties section for each metric, which allows arbitrary information to be passed. You can then associate a particular metric with a function and look up that function at runtime in a dictionary (a hashmap, if you are coming from other languages).

>>> import numpy as np
>>> def func_a(array):
...     return np.mean(np.exp(array))
...
>>> def func_b(array):
...     return np.exp(np.mean(array))
...
>>> funcs = {func_a.__name__: func_a, func_b.__name__: func_b}
>>> def fetch_trial_data(self, trial, metric_properties, metric_name, *args, **kwargs):
...     # The config names the function to associate with each metric
...     # (under that metric's properties); we look it up at run time.
...     func = funcs[metric_properties[metric_name]["function"]]
...     # trial.arm.parameters is a dict, so pass its values as a list
...     return {"a": func(list(trial.arm.parameters.values()))}

to_dict() → dict

Convert BaseWrapper to a dictionary.

Return type: dict

classmethod from_dict(**kwargs)