Running an R Script with BOA
This notebook demonstrates how to:
Write a basic wrapper script in R and have BOA launch your optimization using the BOA CLI. See the instructions for creating a model wrapper for more details about creating a wrapper script.
You can also look at the instructions for configuration files for more details on creating a configuration file.
Configuration File Overview
config.yaml
optimization_options:
    objective_options:
        objectives:
            - name: metric
    experiment:
        name: "r_streamlined_run"
    trials: 15
parameters:
    x0:
        'bounds': [ 0, 1 ]
        'type': 'range'
        'value_type': 'float'
    x1:
        'bounds': [ 0, 1 ]
        'type': 'range'
        'value_type': 'float'
    x2:
        'bounds': [ 0, 1 ]
        'type': 'range'
        'value_type': 'float'
    x3:
        'bounds': [ 0, 1 ]
        'type': 'range'
        'value_type': 'float'
    x4:
        'bounds': [ 0, 1 ]
        'type': 'range'
        'value_type': 'float'
    x5:
        'bounds': [ 0, 1 ]
        'type': 'range'
        'value_type': 'float'
script_options:
    # notice here that this is a shell command
    # this is what BOA will do to launch your script
    # it will also pass as a command line argument the current trial directory
    # that is being parameterized
    # This can either be a relative path or absolute path
    # (by default when BOA launches from a config file
    # it uses the config file directory as your working directory)
    # here config.yaml and run_model.R are in the same directory
    run_model: Rscript run_model.R
# options only needed by the model and not BOA
# You can put anything here that your model might need
# We don't need anything extra so we leave it commented out
# model_options:
#     the_question: 42
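As a quick sanity check, the parameter names in the config can be compared with the keys the wrapper script expects to read. The sketch below is only an illustration: it assumes the `yaml` R package is installed, inlines a two-parameter excerpt of the `parameters` section instead of reading `config.yaml` from disk, and uses a hypothetical `expected` vector as a stand-in for the names your wrapper uses.

```r
library(yaml)  # assumption: the yaml package is installed

# Inline excerpt of the parameters section (shortened to two parameters)
config_text <- "
parameters:
    x0:
        'bounds': [ 0, 1 ]
        'type': 'range'
        'value_type': 'float'
    x1:
        'bounds': [ 0, 1 ]
        'type': 'range'
        'value_type': 'float'
"

cfg <- yaml.load(config_text)

# Parameter names as they are spelled in the config
param_names <- names(cfg$parameters)

# Hypothetical list of the keys the wrapper script reads
expected <- c("x0", "x1")
missing_in_config <- setdiff(expected, param_names)
if (length(missing_in_config) > 0) {
    stop("Parameters missing from config: ",
         paste(missing_in_config, collapse = ", "))
}
print(param_names)
```

To check a real file, `read_yaml("path/to/config.yaml")` from the same package could replace the inline text.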
Wrapper Run Script
run_model.R
# load in any libraries and modules we need
library(jsonlite)
source("../r_utils/hartman6.R")
# This is where we read in the command line argument from BOA.
# If your script uses any other command line arguments,
# BOA's trial_dir should generally be the last command line argument,
# so taking the last one should generally be safe.
args <- commandArgs(trailingOnly=TRUE)
trial_dir <- args[length(args)]
# In this trial_dir folder there are 2 files supplied by BOA:
# a parameters.json that has just the parameters, and a trial.json
# that includes the parameters and a lot more in case you need it.
# Most people will only need the parameters.json
param_path <- file.path(trial_dir, "parameters.json")
data <- read_json(path=param_path)
# The parameter keys come from whatever you named them in your
# config file, which you are free to change.
x0 <- data$x0
x1 <- data$x1
x2 <- data$x2
x3 <- data$x3
x4 <- data$x4
x5 <- data$x5
X <- c(x0, x1, x2, x3, x4, x5)
# This is where we actually run our "model".
# Here we are using a synthetic function called hartman6
# But you could substitute it for your own model in
# a number of ways.
res <- hartman6(X)
# In this case, we directly ran the model, so we are getting back a number
# or NA, so we know if it succeeded or failed. If you are submitting a job
# to an HPC (a supercomputer) queue, this might work, or you might have to
# rely on another method. Other options could be relying on log file output
# or information from querying the queue itself,
# though those may be better as standalone `Set Trial Status Scripts`
if (!is.na(res)) {
    # If it was a success, we don't even need to write out trial status;
    # it is assumed a success if we write out data and don't fail
    out_data <- list(
        metric=res
        # trial_status=unbox("COMPLETED")  # this is optional if it succeeds
    )
} else {
    # If we fail, then we do need to include a trial status, and mark it as failed.
    out_data <- list(
        trial_status=unbox("FAILED")
    )
}
json_data <- toJSON(out_data, pretty = TRUE)
write(json_data, file.path(trial_dir, "output.json"))
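Before handing the wrapper to BOA, it can be useful to dry-run it by hand. The sketch below is one way to do that, under stated assumptions: it builds a throwaway trial directory, writes a `parameters.json` in the flat name-to-value shape that the wrapper reads above (the file BOA actually writes may be formatted differently), and then calls the wrapper with that directory as the last argument. The parameter values and the directory name `dry_run_trial` are made up for illustration.

```r
library(jsonlite)

# Throwaway directory standing in for a BOA trial directory (hypothetical name)
trial_dir <- file.path(tempdir(), "dry_run_trial")
dir.create(trial_dir, showWarnings = FALSE, recursive = TRUE)

# Made-up parameter values, keyed the same way as in config.yaml
params <- list(x0 = 0.1, x1 = 0.2, x2 = 0.3, x3 = 0.4, x4 = 0.5, x5 = 0.6)
write_json(params, file.path(trial_dir, "parameters.json"),
           auto_unbox = TRUE, pretty = TRUE)

# Read the file back the same way run_model.R does
data <- read_json(path = file.path(trial_dir, "parameters.json"))
X <- c(data$x0, data$x1, data$x2, data$x3, data$x4, data$x5)
print(X)

# Only call the real wrapper if it is in the working directory
if (file.exists("run_model.R")) {
    system2("Rscript", c("run_model.R", shQuote(trial_dir)))
    out_path <- file.path(trial_dir, "output.json")
    if (file.exists(out_path)) {
        print(read_json(out_path))
    }
}
```

If the wrapper is present and succeeds, `output.json` should appear in the same directory, which is the file the wrapper writes at its end.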
We also use a function called hartman6, a 6-dimensional version of the synthetic Hartmann function, as the stand-in for our model function. The code is below. You would substitute this with a call to your own model, be it a local call to your own R model, a system call to a Fortran model wrapped in your R script, or perhaps some code that launches an HPC job and collects the results.
hartman6.R
hartman6 <- function(X) {
    out <- tryCatch(
        {
            alpha <- c(1.0, 1.2, 3.0, 3.2)
            A <- c(10, 3, 17, 3.5, 1.7, 8,
                   0.05, 10, 17, 0.1, 8, 14,
                   3, 3.5, 1.7, 10, 17, 8,
                   17, 8, 0.05, 10, 0.1, 14)
            A <- matrix(A, 4, 6, byrow=TRUE)
            P <- 10^(-4) * c(1312, 1696, 5569, 124, 8283, 5886,
                             2329, 4135, 8307, 3736, 1004, 9991,
                             2348, 1451, 3522, 2883, 3047, 6650,
                             4047, 8828, 8732, 5743, 1091, 381)
            P <- matrix(P, 4, 6, byrow=TRUE)
            Xmat <- matrix(rep(X,times=4), 4, 6, byrow=TRUE)
            inner_sum <- rowSums(A[,1:6]*(Xmat-P[,1:6])^2)
            outer_sum <- sum(alpha * exp(-inner_sum))
            y <- -outer_sum
            return(y)
        },
        error=function(cond) {
            return(NA)
        }
    )
    return(out)
}
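A quick way to check the function is to evaluate it at the commonly cited global minimizer of the 6-dimensional Hartmann function, where the value should be close to -3.32237. The sketch below repeats the definition from above in compact form so that it runs on its own. `x_star` is the minimizer as reported in the optimization test-function literature, not a value produced by BOA.

```r
# Compact copy of hartman6 from above so the snippet is self-contained
hartman6 <- function(X) {
    tryCatch({
        alpha <- c(1.0, 1.2, 3.0, 3.2)
        A <- matrix(c(10, 3, 17, 3.5, 1.7, 8,
                      0.05, 10, 17, 0.1, 8, 14,
                      3, 3.5, 1.7, 10, 17, 8,
                      17, 8, 0.05, 10, 0.1, 14), 4, 6, byrow = TRUE)
        P <- 10^(-4) * matrix(c(1312, 1696, 5569, 124, 8283, 5886,
                                2329, 4135, 8307, 3736, 1004, 9991,
                                2348, 1451, 3522, 2883, 3047, 6650,
                                4047, 8828, 8732, 5743, 1091, 381),
                              4, 6, byrow = TRUE)
        # Repeat X on each of the 4 rows so it lines up with A and P
        Xmat <- matrix(rep(X, times = 4), 4, 6, byrow = TRUE)
        -sum(alpha * exp(-rowSums(A * (Xmat - P)^2)))
    }, error = function(cond) NA)
}

# Commonly cited global minimizer (from the literature, an outside reference)
x_star <- c(0.20169, 0.150011, 0.476874, 0.275332, 0.311652, 0.6573)
y_star <- hartman6(x_star)
print(y_star)

# Non-numeric input triggers the error handler, so the result is NA
y_bad <- hartman6(c("a", "b", "c", "d", "e", "f"))
print(y_bad)
```

The `NA` case is what the wrapper script relies on to decide whether to report a `FAILED` trial status.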
Running our script
To run our script, we just need to pass the path of the config file to BOA’s CLI:
python -m boa --config-file path/to/config.yaml
or
python -m boa -c path/to/config.yaml
[WARNING 07-13 14:30:49] ax.service.utils.with_db_settings_base: Ax currently requires a sqlalchemy version below 2.0. This will be addressed in a future release. Disabling SQL storage in Ax for now, if you would like to use SQL storage please install Ax with mysql extras via `pip install ax-platform[mysql]`.
[INFO 07-13 14:30:50] ax.service.utils.instantiation: Created search space: SearchSpace(parameters=[RangeParameter(name='x0', parameter_type=FLOAT, range=[0.0, 1.0]), RangeParameter(name='x1', parameter_type=FLOAT, range=[0.0, 1.0]), RangeParameter(name='x2', parameter_type=FLOAT, range=[0.0, 1.0]), RangeParameter(name='x3', parameter_type=FLOAT, range=[0.0, 1.0]), RangeParameter(name='x4', parameter_type=FLOAT, range=[0.0, 1.0]), RangeParameter(name='x5', parameter_type=FLOAT, range=[0.0, 1.0])], parameter_constraints=[]).
[INFO 07-13 14:30:50] ax.modelbridge.dispatch_utils: Using Models.GPEI since there are more ordered parameters than there are categories for the unordered categorical parameters.
[INFO 07-13 14:30:50] ax.modelbridge.dispatch_utils: Calculating the number of remaining initialization trials based on num_initialization_trials=None max_initialization_trials=None num_tunable_parameters=6 num_trials=None use_batch_trials=False
[INFO 07-13 14:30:50] ax.modelbridge.dispatch_utils: calculated num_initialization_trials=12
[INFO 07-13 14:30:50] ax.modelbridge.dispatch_utils: num_completed_initialization_trials=0 num_remaining_initialization_trials=12
[INFO 07-13 14:30:50] ax.modelbridge.dispatch_utils: Using Bayesian Optimization generation strategy: GenerationStrategy(name='Sobol+GPEI', steps=[Sobol for 12 trials, GPEI for subsequent trials]). Iterations after 12 will take longer to generate due to model-fitting.
[INFO 07-13 14:30:50] Scheduler: `Scheduler` requires experiment to have immutable search space and optimization config. Setting property immutable_search_space_and_opt_config to `True` on experiment.
[INFO 2023-07-13 14:30:50,865 MainProcess] boa:
##############################################
BOA Experiment Run
Output Experiment Dir: [/path/to/your/dir/]/r_streamlined_run_20230713T143050
Start Time: 20230713T143050
Version: 0.8.7.dev4+gae30cf2.d20230713
##############################################
[INFO 07-13 14:30:50] Scheduler: Running trials [0]...
[INFO 07-13 14:30:52] Scheduler: Running trials [1]...
[INFO 07-13 14:30:54] Scheduler: Running trials [2]...
[INFO 07-13 14:30:56] Scheduler: Running trials [3]...
[INFO 07-13 14:30:57] Scheduler: Running trials [4]...
[INFO 07-13 14:30:59] Scheduler: Running trials [5]...
[INFO 07-13 14:31:00] Scheduler: Running trials [6]...
[INFO 07-13 14:31:01] Scheduler: Running trials [7]...
[INFO 07-13 14:31:04] Scheduler: Running trials [8]...
[INFO 07-13 14:31:07] Scheduler: Running trials [9]...
[INFO 07-13 14:31:08] Scheduler: Retrieved COMPLETED trials: 0 - 9.
[INFO 07-13 14:31:08] Scheduler: Fetching data for trials: 0 - 9.
[INFO 2023-07-13 14:31:08,745 MainProcess] boa: Saved JSON-serialized state of optimization to `[/path/to/your/dir/]/r_streamlined_run_20230713T143050/scheduler.json`.
Boa version: 0.8.7.dev4+gae30cf2.d20230713
[INFO 2023-07-13 14:31:08,803 MainProcess] boa: Trials so far: 10
Running trials:
Will Produce next trials from generation step: Sobol
Best trial so far: {9: {'metric': -0.5032}}
[INFO 07-13 14:31:08] Scheduler: Running trials [10]...
[INFO 07-13 14:31:09] Scheduler: Running trials [11]...
[INFO 07-13 14:31:15] Scheduler: Running trials [12]...
[INFO 07-13 14:31:16] ax.modelbridge.torch: The observations are identical to the last set of observations used to fit the model. Skipping model fitting.
[INFO 07-13 14:31:21] Scheduler: Running trials [13]...
[INFO 07-13 14:31:22] ax.modelbridge.torch: The observations are identical to the last set of observations used to fit the model. Skipping model fitting.
[INFO 07-13 14:31:24] Scheduler: Running trials [14]...
[INFO 07-13 14:31:26] Scheduler: Retrieved COMPLETED trials: 10 - 14.
[INFO 07-13 14:31:26] Scheduler: Fetching data for trials: 10 - 14.
[INFO 2023-07-13 14:31:26,333 MainProcess] boa: Saved JSON-serialized state of optimization to `[/path/to/your/dir/]/r_streamlined_run_20230713T143050/scheduler.json`.
Boa version: 0.8.7.dev4+gae30cf2.d20230713
[INFO 2023-07-13 14:31:26,360 MainProcess] boa: Trials so far: 15
Running trials:
Will Produce next trials from generation step: GPEI
Best trial so far: {12: {'metric': -0.6201}}
[INFO 2023-07-13 14:31:26,375 MainProcess] boa: Saved JSON-serialized state of optimization to `[/path/to/your/dir/]/r_streamlined_run_20230713T143050/scheduler.json`.
Boa version: 0.8.7.dev4+gae30cf2.d20230713
[INFO 2023-07-13 14:31:26,400 MainProcess] boa: Trials so far: 15
Running trials:
Will Produce next trials from generation step: GPEI
Best trial so far: {12: {'metric': -0.6201}}
[INFO 2023-07-13 14:31:26,422 MainProcess] boa:
##############################################
Trials Completed!
BOA Experiment Run
Output Experiment Dir: [/path/to/your/dir/]/r_streamlined_run_20230713T143050
Start Time: 20230713T143050
Version: 0.8.7.dev4+gae30cf2.d20230713
End Time: 20230713T143126
Total Run Time: 35.53495001792908
trial_index arm_name trial_status ... x3 x4 x5
0 0 0_0 COMPLETED ... 0.755183 0.548820 0.555407
1 1 1_0 COMPLETED ... 0.441893 0.777394 0.218111
2 2 2_0 COMPLETED ... 0.833717 0.685323 0.358227
3 3 3_0 COMPLETED ... 0.337164 0.706950 0.410019
4 4 4_0 COMPLETED ... 0.842163 0.989052 0.966872
5 5 5_0 COMPLETED ... 0.081540 0.828128 0.128949
6 6 6_0 COMPLETED ... 0.138704 0.598400 0.496507
7 7 7_0 COMPLETED ... 0.723201 0.459715 0.961720
8 8 8_0 COMPLETED ... 0.614263 0.018990 0.856079
9 9 9_0 COMPLETED ... 0.324603 0.855542 0.100329
10 10 10_0 COMPLETED ... 0.955197 0.150812 0.603905
11 11 11_0 COMPLETED ... 0.646193 0.756861 0.555784
12 12 12_0 COMPLETED ... 0.301131 0.897643 0.021916
13 13 13_0 COMPLETED ... 0.379400 0.892975 0.053183
14 14 14_0 COMPLETED ... 0.280032 0.808321 0.174706
[15 rows x 11 columns]
##############################################