After self-implementing a grid search but having a horrible time writing pyplot code to visualize the results, I finally decided to find an existing tool to do the HP tuning for me.
<p>There are two popular HP tuning frameworks:</p><ul><li>Ray Tune: almost the industry standard</li><li>Optuna: user friendly, requires minimal modification to your original code</li></ul><p>There’s also <a href="https://github.com/skorch-dev/skorch">skorch</a>, which integrates scikit-learn and PyTorch, so you can use sklearn’s <a href="https://skorch.readthedocs.io/en/v1.0.0/user/quickstart.html#grid-search">GridSearchCV</a>. For our simple task, we will go with Optuna.</p><h2 id="getting-started">Getting Started</h2><p>To get Optuna running, you just need to add 4 lines in your training logic and a few more lines to start its search. In the training logic:</p>
<pre>def train_model(image_datasets, lr, weight_decay, num_epochs, trial: optuna.trial.Trial = None):
    optimizer = optim.AdamW(model.parameters(), lr=lr, weight_decay=weight_decay)
    best_acc, best_loss = 0.0, float("inf")
    for epoch in range(num_epochs):
        model.train()
        for inputs, labels in dataloaders["train"]:
            ...  # Training logic
        model.eval()
        running_loss = 0.0
        for inputs, labels in dataloaders["val"]:
            ...  # Eval logic (computes loss and, per epoch, epoch_acc)
            running_loss += loss.item() * inputs.size(0)
        epoch_loss = running_loss / dataset_sizes["val"]
        if epoch_acc > best_acc or (epoch_acc == best_acc and epoch_loss < best_loss):
            best_acc, best_loss = epoch_acc, epoch_loss
        """ OPTUNA CODE GOES HERE:
        For each epoch, report the value of a user-defined metric.
        Optuna uses this metric alone to decide whether to prune this
        trial at the current epoch. The objective value you return
        has nothing to do with pruning.
        Read more at: https://optuna.readthedocs.io/en/v3.6.1/reference/generated/optuna.trial.Trial.html#optuna.trial.Trial.report
        """
        if trial is not None:
            trial.report(epoch_loss, epoch)
            if trial.should_prune():
                raise optuna.exceptions.TrialPruned()
    return best_loss
</pre>
<pre>def optuna_objective(trial: optuna.trial.Trial):
    """ Define a custom objective function we want to optimize.
    This function returns the value of the criterion you want to finally evaluate
    your model on, i.e. how you compare different models. The best model should have
    the best value of this objective. If you say the best model should have the
    highest training accuracy at the last epoch, then return the training accuracy
    at the last epoch here. In our example, we think the best model should have the
    best best_loss, where a model's best_loss is its lowest validation loss across
    all epochs.
    """
    image_datasets = prepare_data()
    lr = trial.suggest_float("lr", 1e-6, 1e-1, log=True)
    weight_decay = trial.suggest_float("weight_decay", 1e-6, 1e-1, log=True)
    loss = train_model(image_datasets, lr, weight_decay, 15, trial)
    return loss

if __name__ == "__main__":
    """
    Create a study called plant_144 where we minimize the objective passed in.
    Start the search. The search ends when we finish 10 trials or spend 3 hours.
    """
    study = optuna.create_study(
        direction="minimize",
        study_name="plant_144")
    study.optimize(optuna_objective, n_trials=10, timeout=3 * 60 * 60)
    print("  Objective Value: ", study.best_trial.value)
    print("  Params: ")
    for key, value in study.best_trial.params.items():
        print(f"    {key}: {value}")
</pre><p>To persist the search results so you can inspect them later, pass the <code>storage</code> argument:</p>
<pre>study = optuna.create_study(
    direction="minimize",
    study_name="plant_144",
    storage="sqlite:///db.sqlite3")
</pre><p>This creates <code>db.sqlite3</code> under the current directory.</p><p>This file is a general database and can store studies other than the one called <code>plant_144</code>. You can store another study inside it:</p>
<pre>study = optuna.create_study(
    direction="maximize",
    study_name="plant_8",
    storage="sqlite:///db.sqlite3")
</pre>
<p>To browse everything stored in the database from a web UI, run optuna-dashboard:</p><pre>optuna-dashboard sqlite:///db.sqlite3
</pre><p><em>(Figure: the optuna-dashboard UI)</em></p>
<p>Because the study lives in the database, another process can load it and continue the search:</p><pre>if __name__ == '__main__':
    study = optuna.load_study(study_name='plant_144', storage='sqlite:///db.sqlite3')
    study.optimize(objective, n_trials=100)
</pre>
<p>To parallelize, just launch the same script on several GPUs:</p><pre>CUDA_VISIBLE_DEVICES=3 nohup python optuna.py > log3.txt 2>&1 &
CUDA_VISIBLE_DEVICES=5 nohup python optuna.py > log5.txt 2>&1 &
</pre><p>Both processes contribute trials to the same study plant_144 in the file db.sqlite3.</p><p>For more information on parallelizing over multiple GPUs, check the official guide: <a href="https://optuna.readthedocs.io/en/v3.6.1/tutorial/10_key_features/004_distributed.html">Easy Parallelization</a>.</p><h2 id="some-complaints">Some Complaints</h2><p>In its visualization, Optuna doesn’t provide an option to filter out the “bad” trials, so a few terrible runs blow up the scale of every graph and leave it with almost no information.</p>