Creating model wrappers#

Language Agnostic Interface#

The language-agnostic interface revolves around one or more scripts that you write (detailed below). BOA calls each script with a single command line argument: the path to a folder of JSON files that contain optimization information, such as the parameters for a trial.

You have a few options on the number of scripts to write. The options are listed below:

  • Write Configs: This script lets you write out any configuration or setup files your model needs in order to run. It is called right before the Run Model Script. (This script is optional, and much or all of this can be accomplished in your Run Model Script, but it is provided to allow code partitioning options)

  • Run Model: This is the script that actually runs your model code. The model can live directly in this script, or this script can kick off another script that is your model. For example, if you write a convenience wrapper in R to interface with a Fortran model, this script could kick off your R interface code.

  • Set Trial Status: This script tells BOA whether your trial passed or failed. (This script is optional, and much or all of this can be accomplished in your Run Model Script or your Fetch Trial Data Script, but it is provided to allow code partitioning options)

  • Fetch Trial Data: This script writes out your data for BOA to consume. (This script is optional, and much or all of this can be accomplished in your Run Model Script, but it is provided to allow code partitioning options)

To run any of these scripts, add the corresponding command line run commands to your config file. A run command is whatever you would type in your terminal (bash, zsh, PowerShell, Windows Command Prompt, etc.) to start your script, for example Rscript run_model.R or python run_model.py. Keep in mind that BOA runs this command directly, so if you use relative paths (as in these examples), they are resolved against the working directory, which by default is the directory that contains your config file. Add the run commands to the following section of your configuration file:

script_options:
    write_configs: whatever your write_configs run command is  # only include `write_configs` if you are using a `Write Configs Script`
    run_model: whatever your run_model run command is
    set_trial_status: whatever your set_trial_status run command is  # only include `set_trial_status` if you are using a `Set Trial Status Script`
    fetch_trial_data: whatever your fetch_trial_data run command is  # only include `fetch_trial_data` if you are using a `Fetch Trial Data Script`
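
To make the working-directory behavior concrete, here is a rough sketch (not BOA's actual implementation) that runs a command from the directory containing the config file and passes a trial directory as the last argument. The helper name `run_script_command`, the file names, and the directory layout are all hypothetical, and the quoting assumes a POSIX-style shell syntax.

```python
import shlex
import subprocess
import sys
import tempfile
from pathlib import Path


def run_script_command(command: str, config_path: Path, trial_dir: Path) -> str:
    """Run `command` from the config file's directory, appending trial_dir last.

    Hypothetical helper that only mimics the behavior described above.
    """
    args = shlex.split(command) + [str(trial_dir)]
    result = subprocess.run(
        args,
        cwd=config_path.parent,  # relative paths in `command` resolve from here
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    config_dir = Path(tmp) / "experiment"
    trial_dir = Path(tmp) / "trial_0"
    config_dir.mkdir()
    trial_dir.mkdir()
    config_path = config_dir / "config.yaml"
    config_path.touch()

    # A tiny stand-in for run_model.py that echoes its last argument.
    (config_dir / "run_model.py").write_text("import sys\nprint(sys.argv[-1])\n")

    # The relative script path works because cwd is the config directory.
    command = f"{shlex.quote(sys.executable)} run_model.py"
    echoed = run_script_command(command, config_path, trial_dir)
    print(echoed == str(trial_dir))
```

If your script lives somewhere else, an absolute path in the run command avoids depending on the working directory at all.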

For examples of the formatting of the JSON files you will output back to BOA, see ScriptWrapper.fetch_trial_data() and ScriptWrapper.set_trial_status().

Here is an example of a Run Model Script that also sets the trial status and writes the output data back to BOA, so it is the only script needed. The model in this case is a synthetic function called hartman6, which is just a stand-in for any black-box model call.

# load in any libraries and modules we need
library(jsonlite)
source("../r_utils/hartman6.R")

# This is where we read in from BOA the command line argument.
# If in your script, you use any other command line arguments,
# generally BOA's trial_dir should be the last command line argument,
# so taking the last one should generally be safe.
args <- commandArgs(trailingOnly=TRUE)
trial_dir <- args[length(args)]

# In this trial_dir folder there are 2 files supplied by BOA,
# a parameters.json that has just the parameters, and a trial.json
# that includes the parameters and a lot more in case you need it.
# Most people will only need the parameters.json
param_path <- file.path(trial_dir, "parameters.json")
data <- read_json(path=param_path)

# The parameter keys come from whatever you named them in your
# config file, which you are free to change.
x0 <- data$x0
x1 <- data$x1
x2 <- data$x2
x3 <- data$x3
x4 <- data$x4
x5 <- data$x5
X <- c(x0, x1, x2, x3, x4, x5)

# This is where we actually run our "model".
# Here we are using a synthetic function called hartman6
# But you could substitute it for your own model in
# a number of ways.
res <- hartman6(X)

# In this case, we directly ran the model, so we are getting back a number
# or nan, so we know if it succeeded or failed. If you are submitting a job
# to an HPC (a super computer) queue, this might work, or you might have to
# rely on another method. Other options could be relying on log file output
# or information from querying the queue itself,
# though those may be better as stand alone `Set Trial Status Scripts`
if (!is.na(res)) {

    # if it was a success, we don't even need to write out trial status,
    # it is assumed a success if we write out data and don't fail
    out_data <- list(
        metric=res
        # trial_status=unbox("COMPLETED")  #  this is optional if it succeeds
    )

} else {

    # If we fail, then we do need to include a trial status, and mark it as failed.
    out_data <- list(
        trial_status=unbox("FAILED")
    )
}

json_data <- toJSON(out_data, pretty = TRUE)
write(json_data, file.path(trial_dir, "output.json"))

You can also partition the logic of your code into different steps if that is more appropriate for your project. Below is an example that does that.

First we see the Write Configs Script

# load in any libraries and modules we need
library(jsonlite)

# This is where we read in from BOA the command line argument.
# If in your script, you use any other command line arguments,
# generally BOA's trial_dir should be the last command line argument,
# so taking the last one should generally be safe.
args <- commandArgs(trailingOnly=TRUE)
trial_dir <- args[length(args)]

# In this trial_dir folder there are 2 files supplied by BOA,
# a parameters.json that has just the parameters, and a trial.json
# that includes the parameters and a lot more in case you need it.
# Most people will only need the parameters.json
param_path <- file.path(trial_dir, "parameters.json")
data <- read_json(path=param_path)

# The parameter keys come from whatever you named them in your
# config file, which you are free to change.
x0 <- data$x0
x1 <- data$x1
x2 <- data$x2
x3 <- data$x3
x4 <- data$x4
x5 <- data$x5

# We write out our data to another file for our run_model.R script.
# in a real use case, this could be writing out the parameters
# to whatever config file your model needs whether it is
# json, a text file, yaml, toml, netcdf, or anything else.
# this just allows you to partition the conversion steps from
# BOA parameters.json to what your model will read
X <- c(x0, x1, x2, x3, x4, x5)
df <- data.frame(X)
write.csv(df, file.path(trial_dir, "data.csv"), row.names = FALSE)

Then the Run Model Script

library(jsonlite)
source("../r_utils/hartman6.R")

args <- commandArgs(trailingOnly=TRUE)
trial_dir <- args[length(args)]

# Here we read the data we wrote out from our write_configs.R script
df <- read.csv(file = file.path(trial_dir, "data.csv"))
X <- df[[1]]

# This is where we actually run our "model".
# Here we are using a synthetic function called hartman6
# But you could substitute it for your own model in
# a number of ways.
res <- hartman6(X)

# Here we write out the data to wherever we want to store it,
# in this case we use the trial_dir
out_data <- list(
    output=unbox(res)
)
json_data <- toJSON(out_data, pretty = TRUE)
write(json_data, file.path(trial_dir, "model_data.json"))

Then the Set Trial Status Script

library(jsonlite)

args <- commandArgs(trailingOnly=TRUE)
trial_dir <- args[length(args)]


# We read in the data we saved before.
# If it is just your own output data, you can probably put this step
# at the end of your Run Model Script. But if you don't know when
# your model finishes because it is running on an HPC,
# you could replace this with checks of the queue, or reading the log
# file for certain statuses, or other techniques.
data <- read_json(path=file.path(trial_dir, "model_data.json"))
is_passed <- (!is.na(data$output))

if (is_passed) {
    # If we passed, we write out the trial status as COMPLETED
    trial_status <- "COMPLETED"
    out_data <- list(
        trial_status=unbox(trial_status)
    )

} else {
    # If we failed, we write out the trial status as FAILED
    out_data <- list(
        trial_status=unbox("FAILED")
    )
}

# save to a trial_status.json file in the trial_dir
json_data <- toJSON(out_data, pretty = TRUE)
write(json_data, file.path(trial_dir, "trial_status.json"))

Finally, the Fetch Trial Data Script

library(jsonlite)

BRANIN_TRUE_MIN <- 0.397887

args <- commandArgs(trailingOnly=TRUE)
trial_dir <- args[length(args)]

# Read in our data
model_data <- read_json(path=file.path(trial_dir, "model_data.json"))
res <- model_data$output

# format and save our data to a format that BOA is expecting
out_data <- list(
    metric=list(
        a=res
    )
)

json_data <- toJSON(out_data, pretty = TRUE)
write(json_data, file.path(trial_dir, "output.json"))
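
For comparison, the last two steps (Set Trial Status and Fetch Trial Data) might look like this in Python. This is a sketch under the same assumptions as the R scripts above: the file names `model_data.json`, `trial_status.json`, and `output.json` and the metric key `a` are taken from those scripts, while the function names are illustrative.

```python
import json
import math
import tempfile
from pathlib import Path


def set_trial_status(trial_dir: Path) -> str:
    """Write trial_status.json based on the saved model output."""
    data = json.loads((trial_dir / "model_data.json").read_text())
    output = data.get("output")
    passed = isinstance(output, (int, float)) and not math.isnan(output)
    status = "COMPLETED" if passed else "FAILED"
    (trial_dir / "trial_status.json").write_text(
        json.dumps({"trial_status": status}, indent=2)
    )
    return status


def fetch_trial_data(trial_dir: Path) -> dict:
    """Reformat the saved model output into output.json."""
    data = json.loads((trial_dir / "model_data.json").read_text())
    out_data = {"metric": {"a": data["output"]}}
    (trial_dir / "output.json").write_text(json.dumps(out_data, indent=2))
    return out_data


with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)

    # A successful run: the model wrote a number.
    (demo_dir / "model_data.json").write_text(json.dumps({"output": 2.0}))
    ok_status = set_trial_status(demo_dir)
    ok_data = fetch_trial_data(demo_dir)

    # A failed run: the model wrote no usable value.
    (demo_dir / "model_data.json").write_text(json.dumps({"output": None}))
    bad_status = set_trial_status(demo_dir)
```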

Python Interface#

To create a model wrapper, you will create a child class of BOA’s BaseWrapper class. BaseWrapper defines the core functions that must be defined in your model wrapper:

  • write_configs()

  • run_model()

  • set_trial_status()

  • fetch_trial_data()

Apart from these core functions, your model wrapper can have additional functions as needed (for example, to help with formatting or scaling model outputs).

See FETCH3’s Wrapper for an example.
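
As a starting point, a wrapper skeleton might look like the sketch below. The method names and signatures follow the examples later on this page; the `BaseWrapper` class defined here is only a stand-in so the snippet runs without BOA installed, and `MyModelWrapper` is a hypothetical name.

```python
class BaseWrapper:
    """Stand-in so this sketch runs without BOA installed.

    In a real project, import BaseWrapper from BOA instead of defining it.
    """


class MyModelWrapper(BaseWrapper):
    def write_configs(self, trial) -> None:
        """Write the model's input files for this trial."""

    def run_model(self, trial) -> None:
        """Start the model run (e.g. launch a script or submit a job)."""

    def set_trial_status(self, trial) -> None:
        """Mark the trial completed or failed once the run's state is known."""

    def fetch_trial_data(self, trial, metric_properties: dict, metric_name: str,
                         *args, **kwargs) -> dict:
        """Return the data the metric needs, keyed as the metric expects."""
        return {}


CORE_METHODS = ["write_configs", "run_model", "set_trial_status", "fetch_trial_data"]
has_all_core_methods = all(callable(getattr(MyModelWrapper, name)) for name in CORE_METHODS)
```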

Example wrapper functions#

The BaseWrapper.write_configs() function#

This function is used to write out the configuration files used in an individual optimization trial run (i.e., your model’s configuration files), or to dynamically write a run script to start an optimization trial run.

This function is how BOA gives your model a new set of parameters to run during each trial.

FETCH3’s wrapper provides a simple example of this function for a case where the model’s parameters simply need to be written to a YAML file:

    def write_configs(self, trial: Trial) -> str:
        """
        Write model configuration file for a trial (model run). This is the config file used by FETCH3
        for the model run.

        The config file is written as ``config.yml`` inside the trial directory.

        Parameters
        ----------
        trial: Trial
            The trial to deploy.

        Returns
        -------
        str
            Path for the config file
        """
        trial_dir = make_trial_dir(self.experiment_dir, trial.index)
        config_dict = self.config.boa_params_to_wpr(trial.arm.parameters, self.config.mapping)
        config_dict["model_options"] = self.model_settings

        logging.info(pformat(config_dict))

        if self._model_trees:
            for model_tree, parameters in config_dict["model_trees"].items():
                parameters["parents"] = self._model_trees[model_tree]

        with open(trial_dir / self.config_file_name, "w") as f:
            # Write model options from loaded config
            # Parameters for the trial from Ax
            yaml.dump(config_dict, f)
            return f.name

The palm_wrapper used to wrap PALM provides an example of a model with more complicated configuration requirements. Here, the parameters are written to a YAML file, but then a batch job script must also be written for each optimization trial run.

    def write_configs(self, trial: Trial) -> None:
        """
        This function is usually used to write out the configurations files used
        in an individual optimization trial run, or to dynamically write a run
        script to start an optimization trial run.

        Parameters
        ----------
        trial : BaseTrial
        """
        trial_config = copy.deepcopy(self.config)
        job_name = zfilled_trial_index(trial.index)
        job_output_dir = self.experiment_dir / job_name
        job_output_dir.mkdir(parents=True)

        run_time = trial_config["model_options"]["output_end_time"] * trial_config["model_options"][
            "palmrun_walltime_scalar"]
        data_analyses_time = (
                (trial_config["model_options"]["output_end_time"] - trial_config["model_options"][
                    "output_start_time"])
                * trial_config["model_options"]["data_analyse_walltime_scalar"])
        trial_config_path = job_output_dir / "trial_config.yaml"
        job_script_path = (job_output_dir / "slurm_job.sh").resolve()

        trial_config["parameters"] = trial.arm.parameters
        trial_config["model_options"]["config_path"] = str(trial_config_path)
        trial_config["model_options"]["job_script_path"] = str(job_script_path)
        trial_config["model_options"]["job_name"] = job_name
        trial_config["model_options"]["log_file"] = str(job_output_dir / f"{job_name}_%j.log")
        trial_config["model_options"]["job_output_dir"] = str(job_output_dir)
        trial_config["model_options"]["run_time"] = int(run_time)
        trial_config["model_options"]["data_analyses_time"] = data_analyses_time
        trial_config["model_options"]["batch_time"] = int(run_time + data_analyses_time)

        self.paths_by_trial[trial.index] = dict(trial_config_path=trial_config_path,
                                                job_script_path=job_script_path)

        with open(JOB_SCRIPT_PATH) as template:
            job_script = template.read()
        jinja_env = jinja2.Environment(
            loader=jinja2.BaseLoader(),
        )
        template = jinja_env.from_string(job_script)
        job_script = template.render(**trial_config["model_options"])

        with open(job_script_path, "w") as f:
            f.write(job_script)

        with open(trial_config_path, 'w') as f:
            yaml.dump(trial_config, f)
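
The overall shape of this step (make a per-trial directory, write the parameters to a config file, render a job script from a template) can be sketched with only the standard library. Note the substitutions: this sketch uses JSON in place of YAML and `string.Template` in place of Jinja2, and the template contents, the zero-padding width, and the function name `write_trial_files` are all hypothetical.

```python
import json
import tempfile
from pathlib import Path
from string import Template

# Hypothetical job script template; $names are filled in per trial.
JOB_TEMPLATE = Template(
    "#!/bin/bash\n"
    "#SBATCH --job-name=$job_name\n"
    "#SBATCH --output=$log_file\n"
    "python run_model.py $config_path\n"
)


def write_trial_files(experiment_dir: Path, trial_index: int, parameters: dict) -> dict:
    """Create the trial directory, config file, and job script for one trial."""
    job_name = str(trial_index).zfill(6)  # padding width is an assumption
    job_dir = experiment_dir / job_name
    job_dir.mkdir(parents=True, exist_ok=True)

    config_path = job_dir / "trial_config.json"
    job_script_path = job_dir / "slurm_job.sh"

    config_path.write_text(json.dumps({"parameters": parameters}, indent=2))
    job_script_path.write_text(
        JOB_TEMPLATE.substitute(
            job_name=job_name,
            log_file=job_dir / f"{job_name}_%j.log",
            config_path=config_path,
        )
    )
    return {"trial_config_path": config_path, "job_script_path": job_script_path}


with tempfile.TemporaryDirectory() as tmp:
    paths = write_trial_files(Path(tmp), 3, {"x0": 0.1, "x1": 0.2})
    script_text = paths["job_script_path"].read_text()
    saved_params = json.loads(paths["trial_config_path"].read_text())["parameters"]
```

Returning the paths (or storing them per trial index, as the palm_wrapper does) lets later steps find the files without rebuilding the names.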

The BaseWrapper.run_model() function#

This function defines how to start a run of your model. In most cases, it can be as simple as launching a python or shell script to start a model run.

FETCH3’s wrapper provides an example of this function for the case where the model run is started by running a python script with command line arguments:

    def run_model(self, trial: Trial):

        trial_dir = get_trial_dir(self.experiment_dir, trial.index)
        config_path = trial_dir / self.config_file_name

        # model_dir = self.model_settings["model_dir"]

        # os.chdir(model_dir)

        cmd = self.script_options.run_model.format(config_path=config_path,
                                                    data_path=self.model_settings['data_path'],
                                                    trial_dir=trial_dir)

        args = cmd.split()
        popen = subprocess.Popen(args, stdout=subprocess.PIPE, universal_newlines=True)
        self._processes.append(popen)

The palm_wrapper used to wrap PALM takes the batch job script written in write_configs and runs it, starting a job on an HPC. The job script also uses the YAML file written by write_configs.

    def run_model(self, trial: Trial):
        """
        Runs a model by deploying a given trial.

        Parameters
        ----------
        trial : BaseTrial
        """
        trial_config = load_yaml(self.paths_by_trial[trial.index]["trial_config_path"], normalize=False)

        job_script_path = trial_config["model_options"]["job_script_path"]
        cmd = f"sbatch {job_script_path}"

        args = cmd.split()
        subprocess.Popen(
            args, stdout=subprocess.PIPE, universal_newlines=True
        )
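
Both examples split the command string with `str.split()`, which breaks arguments apart at every space, including spaces inside a path. If your paths may contain spaces, the standard library's `shlex` module is one alternative. The sketch below assumes POSIX-style quoting; the command, path, and `launch` helper are hypothetical.

```python
import shlex
import subprocess
import sys


def launch(cmd: str) -> subprocess.Popen:
    """Start a command without blocking, splitting it with shell-like rules."""
    args = shlex.split(cmd)
    return subprocess.Popen(args, stdout=subprocess.PIPE, universal_newlines=True)


config_path = "/tmp/my experiment/config.yml"  # note the space in the path

# Quote values before formatting them into the command template.
cmd = "python main.py --config_path {config_path}".format(
    config_path=shlex.quote(config_path)
)
naive_args = cmd.split()       # the path is split into two pieces
safe_args = shlex.split(cmd)   # the path stays a single argument

# Launch a small child process that echoes its first argument back.
echo_cmd = "{exe} -c {code} {arg}".format(
    exe=shlex.quote(sys.executable),
    code=shlex.quote("import sys; print(sys.argv[1])"),
    arg=shlex.quote(config_path),
)
proc = launch(echo_cmd)
stdout, _ = proc.communicate()
echoed_path = stdout.strip()
```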

The BaseWrapper.set_trial_status() function#

Marks the status of a trial to reflect the status of the model run associated with that trial.

This function defines the criteria for determining the status of the model run for a trial (e.g., whether the model run is completed/still running, failed, etc). Each trial will be polled periodically to determine its status.

The approach for determining the trial status will depend on the structure of the particular model and its outputs. One way to do this is checking the log files of the model.

In these two examples, the trial status is determined by checking the log file of the model for specific outputs:

    def set_trial_status(self, trial: Trial, log_file='fetch3.log') -> None:
        """Set the status of the trial by checking the model's log file
        for its completion or error messages.
        """
        log_file = get_trial_dir(self.experiment_dir, trial.index) / log_file

        if log_file.exists():
            with open(log_file, "r") as f:
                contents = f.read()
            if "Error completing Run! Reason:" in contents:
                trial.mark_failed()
            elif "run complete" in contents:
                trial.mark_completed()

The palm_wrapper's version checks PALM's log messages:

    def set_trial_status(self, trial: Trial) -> None:
        """
        The trial gets polled from time to time to see if it is completed, failed, still running,
        etc. This marks the trial as one of those options based on some criteria of the model.
        If the model is still running, don't do anything with the trial.

        Parameters
        ----------
        trial : BaseTrial

        Examples
        --------
        trial.mark_completed()
        trial.mark_failed()
        trial.mark_abandoned()
        trial.mark_early_stopped()

        See Also
        --------
        # TODO add sphinx link to ax trial status

        """
        trial_config = load_yaml(self.paths_by_trial[trial.index]["trial_config_path"], normalize=False)

        log_file = Path(trial_config["model_options"]["log_file"])

        if log_file.exists():
            with open(log_file, "r") as f:
                contents = f.read()
            if "palmrun crashed" in contents:
                trial.mark_abandoned()
            elif "error:" in contents:
                trial.mark_failed()
            if "all OUTPUT-files saved" in contents:
                trial.mark_completed()
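
The log-checking pattern used in both examples can be factored into a small helper that maps log text to a status, which keeps the marker strings in one place and is easy to test without a real trial. This is a sketch: the marker strings are copied from the FETCH3 example above, while `status_from_log`, `apply_status`, and `FakeTrial` are hypothetical names, and `FakeTrial` only imitates the `mark_*` methods of an Ax trial.

```python
from typing import Optional

# (marker text, status) pairs, checked in order; the first match wins.
LOG_MARKERS = [
    ("Error completing Run! Reason:", "failed"),
    ("run complete", "completed"),
]


def status_from_log(contents: str) -> Optional[str]:
    """Return the status implied by the log text, or None if still running."""
    for marker, status in LOG_MARKERS:
        if marker in contents:
            return status
    return None


class FakeTrial:
    """Hypothetical stand-in for an Ax trial, used only for this demo."""

    def __init__(self):
        self.status = "running"

    def mark_failed(self):
        self.status = "failed"

    def mark_completed(self):
        self.status = "completed"


def apply_status(trial, contents: str) -> None:
    """Mark the trial only when the log shows a final state."""
    status = status_from_log(contents)
    if status is not None:
        getattr(trial, f"mark_{status}")()


trial = FakeTrial()
apply_status(trial, "step 1 done\n")
still_running = trial.status
apply_status(trial, "step 1 done\nrun complete\n")
finished = trial.status
```

Leaving the trial untouched when no marker is found matches the behavior described above: a trial that is still running is simply polled again later.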

The BaseWrapper.fetch_trial_data() function#

Retrieves the trial data (i.e., model outputs) and prepares it for the metric(s) used in the objective function. The return value needs to be a dictionary with the keys matching the keys of the metric function used in the objective function.

    def fetch_trial_data(self, trial: Trial, metric_properties: dict, metric_name: str, *args, **kwargs):

        modelfile = (
            get_trial_dir(self.experiment_dir, trial.index) / metric_properties[metric_name]["output_fname"]
        )

        fetch_data_func = self.fetch_data_funcs[metric_properties[metric_name]["fetch_data_func"]]

        y_pred, y_true = fetch_data_func(
            modelfile,
            **metric_properties[metric_name]
        )
        return dict(y_pred=y_pred, y_true=y_true)

The palm_wrapper's version reads a JSON output file instead:

    def fetch_trial_data(self, trial: Trial, metric_properties: dict, metric_name: str,  *args, **kwargs):
        """
        Retrieves the trial data and prepares it for the metric(s) used in the objective
        function.

        For example, for a case where you are minimizing the error between a model and observations, using RMSE as a
        metric, this function would load the model output and the corresponding observation data that will be passed to
        the RMSE metric.

        The return value of this function is a dictionary, with keys that match the keys
        of the metric used in the objective function.
        # TODO work on this description

        Parameters
        ----------
        trial : Trial
        metric_properties: dict
        metric_name: str

        Returns
        -------
        dict
            A dictionary with the keys matching the keys of the metric function
                used in the objective
        """
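        # Load the trial's config the same way run_model and set_trial_status do
        trial_config = load_yaml(self.paths_by_trial[trial.index]["trial_config_path"], normalize=False)
        # job_output_dir is stored as a string, so convert it back to a Path
        job_output_dir = Path(trial_config["model_options"]["job_output_dir"])
        data_filepath = job_output_dir / "output.json"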

        with open(data_filepath, 'r') as f:
            data = json.load(f)
        output = np.array(data["output"])
        return dict(a=output)
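
To illustrate how the returned keys line up with a metric's arguments, here is a small standard-library sketch for the RMSE case mentioned in the docstring above. The output file layout (`model` and `observed` keys), the function names, and the hand-written `rmse` are assumptions for illustration and are not BOA's metric implementation.

```python
import json
import math
import tempfile
from pathlib import Path


def fetch_model_and_observations(output_path: Path) -> dict:
    """Load model output and observations, keyed as the metric expects."""
    data = json.loads(output_path.read_text())
    return dict(y_pred=data["model"], y_true=data["observed"])


def rmse(y_pred, y_true) -> float:
    """Root mean squared error between predictions and observations."""
    squared = [(p - t) ** 2 for p, t in zip(y_pred, y_true)]
    return math.sqrt(sum(squared) / len(squared))


with tempfile.TemporaryDirectory() as tmp:
    output_path = Path(tmp) / "output.json"
    output_path.write_text(
        json.dumps({"model": [1.0, 2.0, 3.0], "observed": [2.0, 3.0, 4.0]})
    )
    fetched = fetch_model_and_observations(output_path)
    # The dictionary keys match the metric's parameter names.
    score = rmse(**fetched)
```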

Full Examples#

class Fetch3Wrapper(BaseWrapper):
    _processes = []
    config_file_name = "config.yml"
    fetch_data_funcs = {get_model_sapflux.__name__: get_model_sapflux,
                        get_model_plot_trans.__name__: get_model_plot_trans,
                        get_model_swc.__name__: get_model_swc,
                        get_model_nhl_trans.__name__: get_model_nhl_trans,
                        }

    def __init__(self, *args, **kwargs):
        self._model_trees = {}
        print(args, kwargs)
        super().__init__(*args, **kwargs)

    def load_config(self, config_path, *args, **kwargs):
        """
        Load config takes a configuration path of either a JSON file or a YAML file and returns
        your configuration dictionary.

        Load_config will (unless overwritten in a subclass), do some basic "normalizations"
        to your configuration for convenience. See :func:`.normalize_config`
        for more information about how the normalization works and what config options you
        can control.

        This implementation offers a default implementation that should work for most JSON or YAML
        files, but can be overwritten in subclasses if need be.

        Parameters
        ----------
        config_path
            File path for the experiment configuration file

        Returns
        -------
        BOAConfig
            loaded_config
        """
        config = load_jsonlike(config_path)

        if "model_trees" in config:
            parameter_keys = [["groups", key] for key in config.get("groups", {}).keys()]
            parameter_keys.extend([["model_trees", tree] for tree in config["model_trees"].keys()])
            for model_tree, parameters in config["model_trees"].items():
                self._model_trees[model_tree] = parameters.pop("parents", None)
        elif "species_parameters" in config:
            parameter_keys = [["species_parameters", key] for key in config.get("species_parameters", {}).keys()]
            parameter_keys.append(["site_parameters"])
        else:
            raise ValueError("No model trees or species parameters found in config file")

        self.config = BOAConfig(parameter_keys=parameter_keys, **config)
        return self.config

    def write_configs(self, trial: Trial) -> str:
        """
        Write model configuration file for a trial (model run). This is the config file used by FETCH3
        for the model run.

        The config file is written as ``config.yml`` inside the trial directory.

        Parameters
        ----------
        trial: Trial
            The trial to deploy.

        Returns
        -------
        str
            Path for the config file
        """
        trial_dir = make_trial_dir(self.experiment_dir, trial.index)
        config_dict = self.config.boa_params_to_wpr(trial.arm.parameters, self.config.mapping)
        config_dict["model_options"] = self.model_settings

        logging.info(pformat(config_dict))

        if self._model_trees:
            for model_tree, parameters in config_dict["model_trees"].items():
                parameters["parents"] = self._model_trees[model_tree]

        with open(trial_dir / self.config_file_name, "w") as f:
            # Write model options from loaded config
            # Parameters for the trial from Ax
            yaml.dump(config_dict, f)
            return f.name

    def run_model(self, trial: Trial):

        trial_dir = get_trial_dir(self.experiment_dir, trial.index)
        config_path = trial_dir / self.config_file_name

        # model_dir = self.model_settings["model_dir"]

        # os.chdir(model_dir)

        cmd = self.script_options.run_model.format(config_path=config_path,
                                                    data_path=self.model_settings['data_path'],
                                                    trial_dir=trial_dir)

        args = cmd.split()
        popen = subprocess.Popen(args, stdout=subprocess.PIPE, universal_newlines=True)
        self._processes.append(popen)

    def set_trial_status(self, trial: Trial, log_file='fetch3.log') -> None:
        """Set the status of the trial by checking the model's log file
        for its completion or error messages.
        """
        log_file = get_trial_dir(self.experiment_dir, trial.index) / log_file

        if log_file.exists():
            with open(log_file, "r") as f:
                contents = f.read()
            if "Error completing Run! Reason:" in contents:
                trial.mark_failed()
            elif "run complete" in contents:
                trial.mark_completed()

    def fetch_trial_data(self, trial: Trial, metric_properties: dict, metric_name: str, *args, **kwargs):

        modelfile = (
            get_trial_dir(self.experiment_dir, trial.index) / metric_properties[metric_name]["output_fname"]
        )

        fetch_data_func = self.fetch_data_funcs[metric_properties[metric_name]["fetch_data_func"]]

        y_pred, y_true = fetch_data_func(
            modelfile,
            **metric_properties[metric_name]
        )
        return dict(y_pred=y_pred, y_true=y_true)

link to source: jemissik/fetch3_nhl

class Wrapper(BaseWrapper):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.paths_by_trial = {}

    def write_configs(self, trial: Trial) -> None:
        """
        This function is usually used to write out the configurations files used
        in an individual optimization trial run, or to dynamically write a run
        script to start an optimization trial run.

        Parameters
        ----------
        trial : BaseTrial
        """
        trial_config = copy.deepcopy(self.config)
        job_name = zfilled_trial_index(trial.index)
        job_output_dir = self.experiment_dir / job_name
        job_output_dir.mkdir(parents=True)

        run_time = trial_config["model_options"]["output_end_time"] * trial_config["model_options"][
            "palmrun_walltime_scalar"]
        data_analyses_time = (
                (trial_config["model_options"]["output_end_time"] - trial_config["model_options"][
                    "output_start_time"])
                * trial_config["model_options"]["data_analyse_walltime_scalar"])
        trial_config_path = job_output_dir / "trial_config.yaml"
        job_script_path = (job_output_dir / "slurm_job.sh").resolve()

        trial_config["parameters"] = trial.arm.parameters
        trial_config["model_options"]["config_path"] = str(trial_config_path)
        trial_config["model_options"]["job_script_path"] = str(job_script_path)
        trial_config["model_options"]["job_name"] = job_name
        trial_config["model_options"]["log_file"] = str(job_output_dir / f"{job_name}_%j.log")
        trial_config["model_options"]["job_output_dir"] = str(job_output_dir)
        trial_config["model_options"]["run_time"] = int(run_time)
        trial_config["model_options"]["data_analyses_time"] = data_analyses_time
        trial_config["model_options"]["batch_time"] = int(run_time + data_analyses_time)

        self.paths_by_trial[trial.index] = dict(trial_config_path=trial_config_path,
                                                job_script_path=job_script_path)

        with open(JOB_SCRIPT_PATH) as template:
            job_script = template.read()
        jinja_env = jinja2.Environment(
            loader=jinja2.BaseLoader(),
        )
        template = jinja_env.from_string(job_script)
        job_script = template.render(**trial_config["model_options"])

        with open(job_script_path, "w") as f:
            f.write(job_script)

        with open(trial_config_path, 'w') as f:
            yaml.dump(trial_config, f)

    def run_model(self, trial: Trial):
        """
        Runs a model by deploying a given trial.

        Parameters
        ----------
        trial : BaseTrial
        """
        trial_config = load_yaml(self.paths_by_trial[trial.index]["trial_config_path"], normalize=False)

        job_script_path = trial_config["model_options"]["job_script_path"]
        cmd = f"sbatch {job_script_path}"

        args = cmd.split()
        subprocess.Popen(
            args, stdout=subprocess.PIPE, universal_newlines=True
        )

    def set_trial_status(self, trial: Trial) -> None:
        """
        The trial gets polled from time to time to see if it is completed, failed, still running,
        etc. This marks the trial as one of those options based on some criteria of the model.
        If the model is still running, don't do anything with the trial.

        Parameters
        ----------
        trial : BaseTrial

        Examples
        --------
        trial.mark_completed()
        trial.mark_failed()
        trial.mark_abandoned()
        trial.mark_early_stopped()

        See Also
        --------
        # TODO add sphinx link to ax trial status

        """
        trial_config = load_yaml(self.paths_by_trial[trial.index]["trial_config_path"], normalize=False)

        log_file = Path(trial_config["model_options"]["log_file"])

        if log_file.exists():
            with open(log_file, "r") as f:
                contents = f.read()
            if "palmrun crashed" in contents:
                trial.mark_abandoned()
            elif "error:" in contents:
                trial.mark_failed()
            if "all OUTPUT-files saved" in contents:
                trial.mark_completed()

    def fetch_trial_data(self, trial: Trial, metric_properties: dict, metric_name: str,  *args, **kwargs):
        """
        Retrieves the trial data and prepares it for the metric(s) used in the objective
        function.

        For example, for a case where you are minimizing the error between a model and observations, using RMSE as a
        metric, this function would load the model output and the corresponding observation data that will be passed to
        the RMSE metric.

        The return value of this function is a dictionary, with keys that match the keys
        of the metric used in the objective function.
        # TODO work on this description

        Parameters
        ----------
        trial : Trial
        metric_properties: dict
        metric_name: str

        Returns
        -------
        dict
            A dictionary with the keys matching the keys of the metric function
                used in the objective
        """
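        # Load the trial's config the same way run_model and set_trial_status do
        trial_config = load_yaml(self.paths_by_trial[trial.index]["trial_config_path"], normalize=False)
        # job_output_dir is stored as a string, so convert it back to a Path
        job_output_dir = Path(trial_config["model_options"]["job_output_dir"])
        data_filepath = job_output_dir / "output.json"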

        with open(data_filepath, 'r') as f:
            data = json.load(f)
        output = np.array(data["output"])
        return dict(a=output)

link to source: madeline-scyphers/palm_wrapper