Description
Proposed refactor
Issues
- The Tuner has caused many issues in the past, and we plan to refactor it. The primary cause is the trainer state snapshotting and restoration.
- Auto batch size scaling doesn't work with validate/test/predict. Users might want to identify an optimal batch_size for inference to better utilize their available compute resources.
- The LR Finder suggestion is not optimal. Sometimes its algorithm suggests a bad LR, and sometimes it doesn't suggest anything at all.
- It doesn't work with Flash finetuning. For example, a user might want to compute a new LR or a new batch_size after a certain number of epochs of pre-training, which is not easily configurable within a single call. One can achieve it with multiple calls, but since we support strategies within Flash, this might be worth adding.
PS: please add more issues up here if you have any regarding the tuner.
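To make the second issue concrete, here is a minimal sketch of a batch size search that does not care which stage it is tuning. It assumes a hypothetical `try_run(batch_size)` callable that runs a few steps of the stage of interest (fit, validate, test or predict) and raises `MemoryError` when the batch does not fit. Neither `find_max_batch_size` nor `try_run` is part of Lightning's API, and the plain doubling strategy is only a simplified stand-in for what the tuner does.

```python
from typing import Callable


def find_max_batch_size(
    try_run: Callable[[int], None],
    init_size: int = 2,
    max_trials: int = 25,
) -> int:
    """Double the batch size until `try_run` fails; return the last size that worked.

    `try_run` is a hypothetical hook supplied by the caller, so the same search
    can wrap a fit, validate, test or predict run.
    """
    size = init_size
    last_ok = 0  # 0 means even the initial size did not fit
    for _ in range(max_trials):
        try:
            try_run(size)
        except MemoryError:
            break
        last_ok = size
        size *= 2
    return last_ok


def fake_predict_step(batch_size: int) -> None:
    # Stand-in for running a few predict batches; pretends memory runs out above 64.
    if batch_size > 64:
        raise MemoryError("simulated out-of-memory")
```

With the simulated limit above, `find_max_batch_size(fake_predict_step)` returns 64. Passing the stage in as a callable is one possible design; it keeps the search itself free of any trainer state.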
Possible solutions
- We can subclass Trainer for the tuner and give it independent state, so that no snapshotting and restoration of trainer state is needed.
    class Tuner(Trainer):
        # create independent states
        # create custom loops
        ...

    # proposed usage
    trainer.tuner(auto_scale_batch_size=..., auto_lr_find=...).fit()
    trainer.tuner(auto_scale_batch_size=...).predict()
This solution could possibly solve issues 1 and 2, but possibly can't be configured to solve issue 4.
- Another solution, proposed by @Borda, is to make them callbacks, so that users can easily configure them independently, which can help resolve issue 4. But this solution might not resolve issues 1 and 2.
- Another solution @Borda and @SkafteNicki suggested, for now, is to move lr_finder to bolts, experiment and improve it there, and improve scale_batch_size within lightning. But possibly it can't guarantee to solve issue 4.
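As a rough illustration of the callback idea and why it helps with issue 4, the sketch below re-runs an LR search at the start of selected epochs. Everything in it is hypothetical: `Callback`, `ToyTrainer` and `LRFinderCallback` are toy classes written for this example, not Lightning's real classes, and `find_lr` is a placeholder for an actual LR range test.

```python
from typing import Callable, Iterable, List, Set


class Callback:
    """Hypothetical minimal callback interface (not Lightning's real base class)."""

    def on_epoch_start(self, trainer: "ToyTrainer") -> None:
        pass


class ToyTrainer:
    """Toy loop that only tracks the epoch, the learning rate and its callbacks."""

    def __init__(self, max_epochs: int, lr: float, callbacks: Iterable[Callback]) -> None:
        self.max_epochs = max_epochs
        self.lr = lr
        self.callbacks = list(callbacks)
        self.current_epoch = 0
        self.lr_history: List[float] = []

    def fit(self) -> None:
        for epoch in range(self.max_epochs):
            self.current_epoch = epoch
            for cb in self.callbacks:
                cb.on_epoch_start(self)
            # A real loop would train here; the toy loop just records the LR in use.
            self.lr_history.append(self.lr)


class LRFinderCallback(Callback):
    """Re-run a user-supplied LR search at the start of the selected epochs."""

    def __init__(self, find_lr: Callable[[int], float], at_epochs: Set[int]) -> None:
        self.find_lr = find_lr
        self.at_epochs = at_epochs

    def on_epoch_start(self, trainer: ToyTrainer) -> None:
        if trainer.current_epoch in self.at_epochs:
            trainer.lr = self.find_lr(trainer.current_epoch)


# Placeholder search: a real one would sweep learning rates and inspect the loss.
trainer = ToyTrainer(
    max_epochs=4,
    lr=0.1,
    callbacks=[LRFinderCallback(lambda epoch: 0.01 / (epoch + 1), at_epochs={2})],
)
trainer.fit()
```

Here the first two epochs run with the initial LR and the search kicks in at epoch 2, all within a single `fit()` call. Because the callback owns its own schedule, the user can configure when tuning happens without the trainer having to snapshot and restore anything on their behalf in this toy setup.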
Additional context
Other issues with the tuner right now we need to address:
#9625
#10560
#10557
Thanks to @Borda @SkafteNicki @ethanwharris @akihironitta for helping out with the discussion and possible solutions.