Running an R Script with BOA
This notebook demonstrates how to write a basic wrapper script in R and have BOA launch your optimization through the BOA CLI. See the instructions for creating a model wrapper for more details about wrapper scripts.
You can also look at the instructions for configuration files for more details on creating a configuration file.
Configuration File Overview
config.yaml
optimization_options:
    objective_options:
        objectives:
            - name: metric
    experiment:
        name: "r_streamlined_run"
    trials: 15

parameters:
    x0:
        'bounds': [ 0, 1 ]
        'type': 'range'
        'value_type': 'float'
    x1:
        'bounds': [ 0, 1 ]
        'type': 'range'
        'value_type': 'float'
    x2:
        'bounds': [ 0, 1 ]
        'type': 'range'
        'value_type': 'float'
    x3:
        'bounds': [ 0, 1 ]
        'type': 'range'
        'value_type': 'float'
    x4:
        'bounds': [ 0, 1 ]
        'type': 'range'
        'value_type': 'float'
    x5:
        'bounds': [ 0, 1 ]
        'type': 'range'
        'value_type': 'float'

script_options:
    # notice here that this is a shell command
    # this is what BOA will do to launch your script
    # it will also pass as a command line argument the current trial directory
    # that is being parameterized
    # This can either be a relative path or absolute path
    # (by default when BOA launches from a config file
    # it uses the config file directory as your working directory)
    # here config.yaml and run_model.R are in the same directory
    run_model: Rscript run_model.R

# options only needed by the model and not BOA
# You can put anything here that your model might need
# We don't need anything extra so we leave it commented out
# model_options:
#     the_question: 42
Wrapper Run Script
run_model.R
# load in any libraries and modules we need
library(jsonlite)
source("../r_utils/hartman6.R")
# This is where we read in the command line argument passed by BOA.
# If your script uses any other command line arguments of its own,
# BOA's trial_dir should generally be the last command line argument,
# so taking the last one should generally be safe.
args <- commandArgs(trailingOnly=TRUE)
trial_dir <- args[length(args)]
# In this trial_dir folder there are 2 files supplied by BOA,
# a parameters.json that has just the parameters, and a trial.json
# that includes the parameters and a lot more in case you need it.
# Most people will only need the parameters.json
param_path <- file.path(trial_dir, "parameters.json")
data <- read_json(path=param_path)
# The parameter keys come from whatever you named them in your
# config file, which you are free to change.
x0 <- data$x0
x1 <- data$x1
x2 <- data$x2
x3 <- data$x3
x4 <- data$x4
x5 <- data$x5
X <- c(x0, x1, x2, x3, x4, x5)
# This is where we actually run our "model".
# Here we are using a synthetic function called hartman6
# But you could substitute it for your own model in
# a number of ways.
res <- hartman6(X)
# In this case, we ran the model directly, so we get back either a number
# or NA, which tells us whether it succeeded or failed. If you are submitting
# a job to an HPC (a supercomputer) queue, this might work, or you might have
# to rely on another method. Other options could be relying on log file output
# or information from querying the queue itself,
# though those may be better as standalone `Set Trial Status Scripts`
if (!is.na(res)) {
    # if it was a success, we don't even need to write out trial status,
    # it is assumed a success if we write out data and don't fail
    out_data <- list(
        metric=res
        # trial_status=unbox("COMPLETED")  # this is optional if it succeeds
    )
} else {
    # If we fail, then we do need to include a trial status, and mark it as failed.
    out_data <- list(
        trial_status=unbox("FAILED")
    )
}
json_data <- toJSON(out_data, pretty = TRUE)
write(json_data, file.path(trial_dir, "output.json"))
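Before handing a wrapper to BOA, it can help to exercise the file contract by hand. The sketch below is a minimal illustration under stated assumptions: it fakes a trial directory in a temporary folder, writes a `parameters.json` with made-up values, and then performs the same read, evaluate, write steps as `run_model.R`. The `toy_model` function, the folder name `fake_trial_0`, and the parameter values are hypothetical and are not part of BOA.

```r
library(jsonlite)

# Hypothetical stand-in for a real model: sum of squares of the inputs.
toy_model <- function(X) sum(X^2)

# Fake the trial directory that BOA would normally create and pass
# to the script as the last command line argument.
trial_dir <- file.path(tempdir(), "fake_trial_0")
dir.create(trial_dir, showWarnings = FALSE, recursive = TRUE)

# Write a parameters.json with made-up values for x0..x5.
params <- list(x0 = 0.1, x1 = 0.2, x2 = 0.3, x3 = 0.4, x4 = 0.5, x5 = 0.6)
write_json(params, file.path(trial_dir, "parameters.json"), auto_unbox = TRUE)

# Same steps as the wrapper: read parameters, run the model, write output.json.
data <- read_json(file.path(trial_dir, "parameters.json"))
X <- c(data$x0, data$x1, data$x2, data$x3, data$x4, data$x5)
res <- toy_model(X)

if (!is.na(res)) {
    out_data <- list(metric = res)
} else {
    out_data <- list(trial_status = unbox("FAILED"))
}
write(toJSON(out_data, pretty = TRUE), file.path(trial_dir, "output.json"))

# Read the result back to confirm the round trip.
result <- read_json(file.path(trial_dir, "output.json"))
print(unlist(result$metric))
```

To test the real wrapper the same way, one could create such a folder and run `Rscript run_model.R path/to/fake_trial_0` from the directory containing `run_model.R`, then inspect the `output.json` it leaves behind.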
We also use a function called hartman6, a 6-dimensional version of the synthetic Hartmann function, as the stand-in for our model function. The code is below. You would substitute this with any call to your model, be it a local call to your own R model, a system call to a Fortran model wrapped in your R script, or perhaps some code that launches an HPC job and collects the results.
hartman6.R
hartman6 <- function(X) {
    out <- tryCatch(
        {
            alpha <- c(1.0, 1.2, 3.0, 3.2)
            A <- c(10, 3, 17, 3.5, 1.7, 8,
                   0.05, 10, 17, 0.1, 8, 14,
                   3, 3.5, 1.7, 10, 17, 8,
                   17, 8, 0.05, 10, 0.1, 14)
            A <- matrix(A, 4, 6, byrow=TRUE)
            P <- 10^(-4) * c(1312, 1696, 5569, 124, 8283, 5886,
                             2329, 4135, 8307, 3736, 1004, 9991,
                             2348, 1451, 3522, 2883, 3047, 6650,
                             4047, 8828, 8732, 5743, 1091, 381)
            P <- matrix(P, 4, 6, byrow=TRUE)
            Xmat <- matrix(rep(X,times=4), 4, 6, byrow=TRUE)
            inner_sum <- rowSums(A[,1:6]*(Xmat-P[,1:6])^2)
            outer_sum <- sum(alpha * exp(-inner_sum))
            y <- -outer_sum
            return(y)
        },
        error=function(cond) {
            return(NA)
        }
    )
    return(out)
}
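A quick sanity check of a stand-in function like this is to evaluate it at a point where the answer is known. The sketch below repeats a compact copy of the function so that it runs by itself, then evaluates it at the commonly cited minimizer of the 6-dimensional Hartmann function, where the value should be close to -3.32237. It also passes a non-numeric input to show the `NA` failure path. The names `x_star`, `y_star`, and `bad` are illustrative only.

```r
# Compact copy of hartman6 so this block is self-contained.
hartman6 <- function(X) {
    tryCatch(
        {
            alpha <- c(1.0, 1.2, 3.0, 3.2)
            A <- matrix(c(10, 3, 17, 3.5, 1.7, 8,
                          0.05, 10, 17, 0.1, 8, 14,
                          3, 3.5, 1.7, 10, 17, 8,
                          17, 8, 0.05, 10, 0.1, 14), 4, 6, byrow = TRUE)
            P <- 10^(-4) * matrix(c(1312, 1696, 5569, 124, 8283, 5886,
                                    2329, 4135, 8307, 3736, 1004, 9991,
                                    2348, 1451, 3522, 2883, 3047, 6650,
                                    4047, 8828, 8732, 5743, 1091, 381),
                                  4, 6, byrow = TRUE)
            # Each row of Xmat is a copy of X, one per term of the outer sum.
            Xmat <- matrix(rep(X, times = 4), 4, 6, byrow = TRUE)
            -sum(alpha * exp(-rowSums(A * (Xmat - P)^2)))
        },
        # Any error during evaluation is reported as NA.
        error = function(cond) NA
    )
}

# Commonly cited global minimizer of the 6-dimensional Hartmann function.
x_star <- c(0.20169, 0.150011, 0.476874, 0.275332, 0.311652, 0.6573)
y_star <- hartman6(x_star)
print(y_star)

# A non-numeric input triggers an arithmetic error, which becomes NA.
bad <- hartman6("not a number")
print(is.na(bad))
```

The wrapper script relies on exactly this behavior: a numeric result is written out as the metric, while `NA` causes the trial to be marked as failed.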
Running our script
To run our script, we just need to pass the config file to BOA's CLI:
python -m boa --config-file path/to/config.yaml
or
python -m boa -c path/to/config.yaml
[WARNING 08-09 18:51:27] ax.service.utils.with_db_settings_base: Ax currently requires a sqlalchemy version below 2.0. This will be addressed in a future release. Disabling SQL storage in Ax for now, if you would like to use SQL storage please install Ax with mysql extras via `pip install ax-platform[mysql]`.
[INFO 08-09 18:51:28] ax.service.utils.instantiation: Created search space: SearchSpace(parameters=[RangeParameter(name='x0', parameter_type=FLOAT, range=[0.0, 1.0]), RangeParameter(name='x1', parameter_type=FLOAT, range=[0.0, 1.0]), RangeParameter(name='x2', parameter_type=FLOAT, range=[0.0, 1.0]), RangeParameter(name='x3', parameter_type=FLOAT, range=[0.0, 1.0]), RangeParameter(name='x4', parameter_type=FLOAT, range=[0.0, 1.0]), RangeParameter(name='x5', parameter_type=FLOAT, range=[0.0, 1.0])], parameter_constraints=[]).
[INFO 08-09 18:51:28] ax.modelbridge.dispatch_utils: Using Models.GPEI since there are more ordered parameters than there are categories for the unordered categorical parameters.
[INFO 08-09 18:51:28] ax.modelbridge.dispatch_utils: Calculating the number of remaining initialization trials based on num_initialization_trials=None max_initialization_trials=None num_tunable_parameters=6 num_trials=None use_batch_trials=False
[INFO 08-09 18:51:28] ax.modelbridge.dispatch_utils: calculated num_initialization_trials=12
[INFO 08-09 18:51:28] ax.modelbridge.dispatch_utils: num_completed_initialization_trials=0 num_remaining_initialization_trials=12
[INFO 08-09 18:51:28] ax.modelbridge.dispatch_utils: Using Bayesian Optimization generation strategy: GenerationStrategy(name='Sobol+GPEI', steps=[Sobol for 12 trials, GPEI for subsequent trials]). Iterations after 12 will take longer to generate due to model-fitting.
[INFO 08-09 18:51:28] Scheduler: `Scheduler` requires experiment to have immutable search space and optimization config. Setting property immutable_search_space_and_opt_config to `True` on experiment.
[INFO 2023-08-09 18:51:28,752 MainProcess] boa:
##############################################
BOA Experiment Run
Output Experiment Dir: [/path/to/your/dir/]/r_streamlined_run_20230809T185128
Start Time: 20230809T185128
Version: 0.8.8.dev0+gd6e453f.d20230809
##############################################
[INFO 08-09 18:51:28] Scheduler: Running trials [0]...
[INFO 08-09 18:51:30] Scheduler: Running trials [1]...
[INFO 08-09 18:51:31] Scheduler: Running trials [2]...
[INFO 08-09 18:51:32] Scheduler: Running trials [3]...
[INFO 08-09 18:51:33] Scheduler: Running trials [4]...
[INFO 08-09 18:51:34] Scheduler: Running trials [5]...
[INFO 08-09 18:51:36] Scheduler: Running trials [6]...
[INFO 08-09 18:51:37] Scheduler: Running trials [7]...
[INFO 08-09 18:51:38] Scheduler: Running trials [8]...
[INFO 08-09 18:51:39] Scheduler: Running trials [9]...
[INFO 08-09 18:51:41] Scheduler: Retrieved COMPLETED trials: 0 - 9.
[INFO 08-09 18:51:41] Scheduler: Fetching data for trials: 0 - 9.
[INFO 2023-08-09 18:51:41,229 MainProcess] boa: Saved JSON-serialized state of optimization to `[/path/to/your/dir/]/r_streamlined_run_20230809T185128/scheduler.json`.
Boa version: 0.8.8.dev0+gd6e453f.d20230809
[INFO 2023-08-09 18:51:41,247 MainProcess] boa: Saved optimization parametrization and objective to `[/path/to/your/dir/]/r_streamlined_run_20230809T185128/optimization.csv`.
[INFO 2023-08-09 18:51:41,261 MainProcess] boa: Trials so far: 10
Running trials:
Will Produce next trials from generation step: Sobol
Best trial so far: {3: {'metric': -0.8066}}
[INFO 08-09 18:51:41] Scheduler: Running trials [10]...
[INFO 08-09 18:51:42] Scheduler: Running trials [11]...
[INFO 08-09 18:51:45] Scheduler: Running trials [12]...
[INFO 08-09 18:51:46] ax.modelbridge.torch: The observations are identical to the last set of observations used to fit the model. Skipping model fitting.
[INFO 08-09 18:51:48] Scheduler: Running trials [13]...
[INFO 08-09 18:51:49] ax.modelbridge.torch: The observations are identical to the last set of observations used to fit the model. Skipping model fitting.
[INFO 08-09 18:51:51] Scheduler: Running trials [14]...
[INFO 08-09 18:51:52] Scheduler: Retrieved COMPLETED trials: 10 - 14.
[INFO 08-09 18:51:52] Scheduler: Fetching data for trials: 10 - 14.
[INFO 2023-08-09 18:51:52,889 MainProcess] boa: Saved JSON-serialized state of optimization to `[/path/to/your/dir/]/r_streamlined_run_20230809T185128/scheduler.json`.
Boa version: 0.8.8.dev0+gd6e453f.d20230809
[INFO 2023-08-09 18:51:52,907 MainProcess] boa: Saved optimization parametrization and objective to `[/path/to/your/dir/]/r_streamlined_run_20230809T185128/optimization.csv`.
[INFO 2023-08-09 18:51:52,923 MainProcess] boa: Trials so far: 15
Running trials:
Will Produce next trials from generation step: GPEI
Best trial so far: {14: {'metric': -1.0227}}
[INFO 2023-08-09 18:51:52,939 MainProcess] boa: Saved JSON-serialized state of optimization to `[/path/to/your/dir/]/r_streamlined_run_20230809T185128/scheduler.json`.
Boa version: 0.8.8.dev0+gd6e453f.d20230809
[INFO 2023-08-09 18:51:52,957 MainProcess] boa: Saved optimization parametrization and objective to `[/path/to/your/dir/]/r_streamlined_run_20230809T185128/optimization.csv`.
[INFO 2023-08-09 18:51:52,972 MainProcess] boa: Trials so far: 15
Running trials:
Will Produce next trials from generation step: GPEI
Best trial so far: {14: {'metric': -1.0227}}
[INFO 2023-08-09 18:51:53,001 MainProcess] boa:
##############################################
Trials Completed!
BOA Experiment Run
Output Experiment Dir: [/path/to/your/dir/]/r_streamlined_run_20230809T185128
Start Time: 20230809T185128
Version: 0.8.8.dev0+gd6e453f.d20230809
End Time: 20230809T185152
Total Run Time: 24.220675230026245
trial_index arm_name trial_status ... x3 x4 x5
0 0 0_0 COMPLETED ... 0.629275 0.137796 0.515604
1 1 1_0 COMPLETED ... 0.602257 0.457678 0.505580
2 2 2_0 COMPLETED ... 0.930436 0.081166 0.682230
3 3 3_0 COMPLETED ... 0.213464 0.156547 0.999392
4 4 4_0 COMPLETED ... 0.889072 0.460256 0.870208
5 5 5_0 COMPLETED ... 0.507055 0.845420 0.125827
6 6 6_0 COMPLETED ... 0.355159 0.336532 0.021124
7 7 7_0 COMPLETED ... 0.657354 0.645192 0.058214
8 8 8_0 COMPLETED ... 0.198422 0.875165 0.143583
9 9 9_0 COMPLETED ... 0.255256 0.807304 0.604236
10 10 10_0 COMPLETED ... 0.538395 0.418263 0.199643
11 11 11_0 COMPLETED ... 0.957110 0.653261 0.482938
12 12 12_0 COMPLETED ... 0.217348 0.181392 0.974875
13 13 13_0 COMPLETED ... 0.176546 0.106343 1.000000
14 14 14_0 COMPLETED ... 0.286782 0.250568 1.000000
[15 rows x 11 columns]
##############################################
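The log above reports that the trial parametrizations and objective values are saved to `optimization.csv` in the output experiment directory. The sketch below shows one way such a file could be inspected from R to find the best trial. It is only an illustration: the column name `metric` is an assumption (the printed table elides the middle columns), and the toy file written here contains just the two best-so-far trials and metric values reported in the log rather than the real output. Treating the smallest value as best is inferred from the "Best trial so far" lines, which move from -0.8066 to -1.0227.

```r
# Toy stand-in for optimization.csv, written to a temporary folder.
# Column names are assumed; values are the two "best so far" entries
# reported in the log.
csv_path <- file.path(tempdir(), "optimization.csv")
writeLines(c(
    "trial_index,trial_status,metric",
    "3,COMPLETED,-0.8066",
    "14,COMPLETED,-1.0227"
), csv_path)

results <- read.csv(csv_path)

# Keep only completed trials, then pick the one with the smallest metric.
completed <- results[results$trial_status == "COMPLETED", ]
best <- completed[which.min(completed$metric), ]
print(best)
```

For a real run, `csv_path` would point at the `optimization.csv` inside the output experiment directory instead of the temporary file.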