[WIP] Refactor development regularization cocktails #161

Conversation
cs.add_condition(CS.EqualsCondition(shake_drop_prob, use_shake_drop, True))
cs.add_hyperparameters([use_sc, mb_choice, shake_drop_prob])
cs.add_condition(CS.EqualsCondition(mb_choice, use_sc, True))
# TODO check if shake_drop is as an option in mb_choice
Sorry, what do you mean here?
Hmm, I mean that when mb_choice (multibranch choice; maybe it should be renamed) has no shake-drop in its value_range, the condition that adds shake_drop_prob in line 176 will fail. This can be fixed with a simple if check; at the time I wanted to do it quickly. Maybe it makes sense to add that condition now.
autoPyTorch/pipeline/components/training/data_loader/base_data_loader.py
autoPyTorch/pipeline/components/training/trainer/base_trainer_choice.py
Force-pushed from df3a549 to fff9eda: …tional choice of shake-shake and shake-drop being hyperparameters as the choice for the multi branch networks
X[:, categorical_indices.long()] = self.CATEGORICAL_VALUE
X[:, numerical_indices.long()] = self.NUMERICAL_VALUE

lam = 1
What is the definition of lam?
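For context, in mixup/cutmix-style trainers lam is conventionally the Beta-sampled mixing coefficient, and lam = 1 means "no mixing", so the criterion reduces to the plain loss on y_a. A sketch of that convention (an assumption about the intent here, not this PR's actual code):

```python
import numpy as np

def sample_lam(alpha: float = 0.2, apply_mixing: bool = True) -> float:
    """Mixing weight for loss = lam * L(pred, y_a) + (1 - lam) * L(pred, y_b).

    lam = 1.0 keeps only y_a, i.e. the batch is effectively unmixed,
    which is why trainers that do not mix targets can just set lam = 1.
    """
    if not apply_mixing:
        return 1.0
    return float(np.random.beta(alpha, alpha))
```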
default_value=True),
se_lastk: HyperparameterSearchSpace = HyperparameterSearchSpace(
    hyperparameter="se_lastk",
    value_range=(3,),
Suggested change:
-    value_range=(3,),
+    value_range=(3, 3),
cs.add_hyperparameter(use_snapshot_ensemble)

if snapshot_ensemble_flag:
    se_lastk = get_hyperparameter(se_lastk, Constant)
How do we know that se_lastk is constant? People can feed a hyperparameter_space that makes this variable non-constant, unless we remove it from the argument.
True, we should fix this
It is constant in our case since that is what we did for the publication; however, it can be a hyperparameter. The only constraint is that it has to be lower than or equal to the number of restarts, so it needs that condition.
) -> ConfigurationSpace:
    cs = ConfigurationSpace()

    # The number of groups that will compose the resnet. That is,
This function is too complex (flake8 C901).
if self.config["use_skip_connection"]:
    residual = self.shortcut(x)

# TODO make the below code better
if self.config["use_skip_connection"]:
    if self.config["multi_branch_choice"] == 'shake-shake':
        x1 = self.layers(x)
        x2 = self.shake_shake_layers(x)
        alpha, beta = shake_get_alpha_beta(self.training, x.is_cuda)
        x = shake_shake(x1, x2, alpha, beta)
    else:
        x = self.layers(x)
else:
    x = self.layers(x)
I would suggest separating the series of operations for the skip connection into a dedicated method, because it does not make sense to check self.config["use_skip_connection"] four times.
True, it can be improved.
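One possible shape for that refactor, checking the flag once and moving the multi-branch logic into its own method. This is a sketch with placeholder linear layers, and a plain average standing in for the real shake_shake(x1, x2, alpha, beta) call:

```python
import torch
import torch.nn as nn

class ResBlockSketch(nn.Module):
    def __init__(self, use_skip_connection: bool,
                 multi_branch_choice: str = "none"):
        super().__init__()
        self.use_skip_connection = use_skip_connection
        self.multi_branch_choice = multi_branch_choice
        self.layers = nn.Linear(8, 8)
        self.shake_shake_layers = nn.Linear(8, 8)
        self.shortcut = nn.Identity()

    def _branch_output(self, x: torch.Tensor) -> torch.Tensor:
        # Multi-branch handling lives here, so forward() stays flat.
        if self.multi_branch_choice == "shake-shake":
            x1 = self.layers(x)
            x2 = self.shake_shake_layers(x)
            return 0.5 * (x1 + x2)  # stand-in for shake_shake(...)
        return self.layers(x)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The skip-connection flag is consulted exactly once.
        if self.use_skip_connection:
            return self._branch_output(x) + self.shortcut(x)
        return self.layers(x)
```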
autoPyTorch/pipeline/components/setup/network_backbone/ShapedResNetBackbone.py
X_adversarial = self.fgsm_attack(X, y)
return (X, X_adversarial), {'y_a': y}
Why do we need to return both the data before and after the perturbation? If fgsm_attack is a destructive method, it should be fixed. Otherwise, X will not change in the lines where this function is called, so it does not make sense to return the original X either way.
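A non-destructive FGSM sketch for reference: it perturbs a detached copy, so the caller's X is untouched and returning only the adversarial batch would suffice. The model, loss function, and epsilon here are illustrative parameters, not the trainer's actual signature:

```python
import torch

def fgsm_attack(model, loss_fn, X, y, epsilon=0.1):
    # Work on a fresh leaf tensor so the caller's X is never modified.
    X_adv = X.clone().detach().requires_grad_(True)
    loss = loss_fn(model(X_adv), y)
    loss.backward()
    # One signed-gradient ascent step of size epsilon.
    return (X_adv + epsilon * X_adv.grad.sign()).detach()
```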
original_data = torch.autograd.Variable(original_data)
adversarial_data = torch.autograd.Variable(adversarial_data)
Can't we just do this with torch.tensor and requires_grad=True? If I remember correctly, Variable is already deprecated and not recommended (see the PyTorch Variable deprecation notes and the torch.tensor documentation).
Sure, we can change it later before merging, unless it affects the performance, which I don't think it does.
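The replacement is essentially a one-liner; a sketch of the modern equivalent (since PyTorch 0.4, tensors carry requires_grad themselves, so Variable is redundant):

```python
import torch

data = torch.randn(4, 3)
# old, deprecated style: data = torch.autograd.Variable(data)
# modern equivalent: a detached leaf tensor that tracks gradients.
data = data.clone().detach().requires_grad_(True)
```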
data_copy = deepcopy(data)
data_copy = data_copy.float().to(self.device)
targets = targets.long().to(self.device)
data_copy = torch.autograd.Variable(data_copy)
use torch.tensor
autoPyTorch/pipeline/components/training/trainer/AdversarialTrainer.py
'shortname': 'AdversarialTrainer',
'name': 'AdversarialTrainer',
'handles_tabular': True,
'handles_image': False,
Why False?
autoPyTorch/pipeline/components/training/trainer/AdversarialTrainer.py
value_range=(0.05, 0.2),
default_value=0.2),
I do not know the right range for this parameter, but shouldn't the range be larger, and shouldn't we apply a log scale to this parameter?
@ArlindKadra has an idea of what it should be.
Deleted most comments that require more time to address.
# Treat a column with all instances a NaN as numerical
# This will prevent doing encoding to a categorical column made completely
# out of nan values -- which will trigger a fail, as encoding is not supported
# with nan values.
Other issues in this file:
- Access to member object_dtype_mapping before its definition in line 433
- Unused variable enc_columns in line 257
shape=self.config['dropout_shape'],
in_feat=0,
out_feat=0,
max_neurons=self.config["max_dropout"],
max_neurons seems to be int, but config["max_dropout"] seems to be float.
numerical_indices = torch.tensor(self.numerical_columns)
categorical_indices = torch.tensor([idx for idx in row_indices if idx not in self.numerical_columns])
It seems numerical_indices and categorical_indices are not compatible. If I understand the implementation correctly, the following should be correct.
Suggested change:
-    numerical_indices = torch.tensor(self.numerical_columns)
+    numerical_indices = torch.tensor([idx for idx in row_indices if idx in self.numerical_columns])
     categorical_indices = torch.tensor([idx for idx in row_indices if idx not in self.numerical_columns])
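A small demonstration of why the suggested form is consistent: both index sets are then drawn from row_indices, so together they partition it. The values below are made up for illustration:

```python
import torch

row_indices = list(range(6))
numerical_columns = [0, 2, 4]  # stand-in for self.numerical_columns

numerical_indices = torch.tensor(
    [idx for idx in row_indices if idx in numerical_columns])
categorical_indices = torch.tensor(
    [idx for idx in row_indices if idx not in numerical_columns])

# Every entry of row_indices lands in exactly one of the two tensors.
```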
autoPyTorch/pipeline/components/training/trainer/RowCutOutTrainer.py
class RowCutOutTrainer(CutOut, BaseTrainerComponent):
    # 0 is non-informative in image data
    NUMERICAL_VALUE = 0
Why is this correct, given that this class is for tabular data rather than image datasets?
    )
)

X[:, row_indices] = X[batch_indices, :][:, row_indices]
Please document the shape and what exactly this line is doing.
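For reference, the assignment overwrites, for each sample i, the columns listed in row_indices with the same columns taken from sample batch_indices[i], i.e. a per-column swap across a shuffled batch. A tiny worked example with illustrative shapes and values:

```python
import torch

X = torch.arange(12.0).reshape(3, 4)     # (batch=3, features=4)
batch_indices = torch.tensor([2, 0, 1])  # shuffled batch order
row_indices = torch.tensor([1, 3])       # feature columns to replace

# X[batch_indices, :] is a (3, 4) copy of X in shuffled order;
# selecting [:, row_indices] gives (3, 2), which is written back
# into the same two columns of X.
X[:, row_indices] = X[batch_indices, :][:, row_indices]
# Row 0 now holds columns 1 and 3 from the original row 2, and so on.
```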
autoPyTorch/pipeline/components/training/trainer/RowCutOutTrainer.py
- Update implementation
- Coding style fixes
- Implementation update
- Style fix
- Turn weighted loss into a constant again, implementation update
- Cocktail branch inconsistencies (#275)
- To nemo
- Revert change in T_curr as results conclusively prove it should be 0
- Revert cutmix change after data from run
- Final conclusion after results
- FIX bug in shake alpha beta
- Updated if is_training condition for shake drop
- Remove temp fix in row cutmic
- Cocktail fixes time debug (#286)
- preprocess inside data validator
- add time debug statements
- Add fixes for categorical data
- add fit_ensemble
- add arlind fix for swa and se
- fix bug in trainer choice fit
- fix ensemble bug
- Correct bug in cleanup
- Cleanup for removing time debug statements
- ablation for adversarial
- shuffle false in dataloader
- drop last false in dataloader
- fix bug for validation set, and cutout and cutmix
- shuffle = False
- Shake Shake updates (#287)
- To test locally
- fix bug in trainer choice fit
- fix ensemble bug
- Correct bug in cleanup
- To test locally
- Cleanup for removing time debug statements
- ablation for adversarial
- shuffle false in dataloader
- drop last false in dataloader
- fix bug for validation set, and cutout and cutmix
- To test locally
- shuffle = False
- To test locally
- updates to search space
- updates to search space
- update branch with search space
- undo search space update
- fix bug in shake shake flag
- limit to shake-even
- restrict to even even
- Add even even and others for shake-drop also
- fix bug in passing alpha beta method
- restrict to only even even
- fix silly bug:
- remove imputer and ordinal encoder for categorical transformer in feature validator
- Address comments from shuhei
- fix issues with ensemble fitting post hoc
- Address comments on the PR
- Fix flake and mypy errors
- Address comments from PR #286
- fix bug in embedding
- Update autoPyTorch/api/tabular_classification.py Co-authored-by: nabenabe0928 <[email protected]>
- Update autoPyTorch/datasets/base_dataset.py Co-authored-by: nabenabe0928 <[email protected]>
- Update autoPyTorch/datasets/base_dataset.py Co-authored-by: nabenabe0928 <[email protected]>
- Update autoPyTorch/pipeline/components/training/trainer/base_trainer.py Co-authored-by: nabenabe0928 <[email protected]>
- Address comments from shuhei
- adress comments from shuhei
- fix flake and mypy
- Update autoPyTorch/pipeline/components/training/trainer/RowCutMixTrainer.py Co-authored-by: nabenabe0928 <[email protected]>
- Update autoPyTorch/pipeline/tabular_classification.py Co-authored-by: nabenabe0928 <[email protected]>
- Update autoPyTorch/pipeline/components/setup/network_backbone/utils.py Co-authored-by: nabenabe0928 <[email protected]>
- Update autoPyTorch/pipeline/components/setup/network_backbone/utils.py Co-authored-by: nabenabe0928 <[email protected]>
- Update autoPyTorch/pipeline/components/setup/network_backbone/utils.py Co-authored-by: nabenabe0928 <[email protected]>
- Apply suggestions from code review Co-authored-by: nabenabe0928 <[email protected]>
- increase threads_per_worker
- fix bug in rowcutmix
- Enhancement for the tabular validator. (#291)
- Initial try at an enhancement for the tabular validator
- Adding a few type annotations
- Fixing bugs in implementation
- Adding wrongly deleted code part during rebase
- Fix bug in _get_args
- Fix bug in _get_args
- Addressing Shuhei's comments
- Address Shuhei's comments
- Refactoring code
- Refactoring code
- Typos fix and additional comments
- Replace nan in categoricals with simple imputer
- Remove unused function
- add comment
- Update autoPyTorch/data/tabular_feature_validator.py Co-authored-by: nabenabe0928 <[email protected]>
- Update autoPyTorch/data/tabular_feature_validator.py Co-authored-by: nabenabe0928 <[email protected]>
- Adding unit test for only nall columns in the tabular feature categorical evaluator
- fix bug in remove all nan columns
- Bug fix for making tests run by arlind
- fix flake errors in feature validator
- made typing code uniform
- Apply suggestions from code review Co-authored-by: nabenabe0928 <[email protected]>
- address comments from shuhei
- address comments from shuhei (2) Co-authored-by: Ravin Kohli <[email protected]> Co-authored-by: Ravin Kohli <[email protected]> Co-authored-by: nabenabe0928 <[email protected]>
- Apply suggestions from code review Co-authored-by: nabenabe0928 <[email protected]>
- resolve code issues with new versions
- Address comments from shuhei
- make run_traditional_ml function
- implement suggestion from shuhei and fix bug in rowcutmixtrainer
- fix return type docstring
- add better documentation and fix bug in shake_drop_get_bl
- Apply suggestions from code review Co-authored-by: nabenabe0928 <[email protected]>
- add test for comparator and other improvements based on PR comments
- fix bug in test
- [fix] Fix the condition in the raising error of all_nan_columns
- [refactor] Unite name conventions of numpy array and pandas dataframe
- [doc] Add the description about the tabular feature transformation
- [doc] Add the description of the tabular feature transformation
- address comments from arlind
- address comments from arlind
- change to as_tensor and address comments from arlind
- correct description for functions in data module Co-authored-by: nabenabe0928 <[email protected]> Co-authored-by: Arlind Kadra <[email protected]> Co-authored-by: nabenabe0928 <[email protected]>
- Addressing Shuhei's comments
- flake8 problems fix
- Update autoPyTorch/api/base_task.py Add indent. Co-authored-by: Ravin Kohli <[email protected]>
- Update autoPyTorch/api/base_task.py Add indent. Co-authored-by: Ravin Kohli <[email protected]>
- Update autoPyTorch/data/tabular_feature_validator.py Add indentation. Co-authored-by: Ravin Kohli <[email protected]>
- Update autoPyTorch/pipeline/components/setup/network_backbone/utils.py Add line indentation. Co-authored-by: Ravin Kohli <[email protected]>
- Update autoPyTorch/data/tabular_feature_validator.py Validate if there is a column transformer since for sparse matrices we will not have one. Co-authored-by: Ravin Kohli <[email protected]>
- Update autoPyTorch/utils/implementations.py Delete uncommented line. Co-authored-by: Ravin Kohli <[email protected]>
- Allow the number of threads to be given by the user
- Removing unnecessary argument and refactoring the attribute.
- Addressing Ravin's comments
- Update autoPyTorch/pipeline/components/setup/network_backbone/utils.py Updating the function documentation according to the agreed style. Co-authored-by: Ravin Kohli <[email protected]>
- Update autoPyTorch/pipeline/components/setup/network_backbone/utils.py Providing information on the wrong method provided for shake-shake regularization. Co-authored-by: nabenabe0928 <[email protected]>
- add todo for backend and accept changes from shuhei
- Addressing Shuhei's and Ravin's comments
- Addressing Shuhei's and Ravin's comments, bug fix
- Update autoPyTorch/pipeline/components/setup/network_backbone/ResNetBackbone.py Improving code readibility. Co-authored-by: nabenabe0928 <[email protected]>
- Update autoPyTorch/pipeline/components/setup/network_backbone/ResNetBackbone.py Improving consistency. Co-authored-by: nabenabe0928 <[email protected]>
- bug fix Co-authored-by: Ravin Kohli <[email protected]> Co-authored-by: nabenabe0928 <[email protected]> Co-authored-by: nabenabe0928 <[email protected]> Co-authored-by: Ravin Kohli <[email protected]>
- Initial fix for all tests passing locally py=3.8
- fix bug in tests
- fix bug in test for data
- debugging error in dummy forward pass
- debug try -2
- catch runtime error in ci
- catch runtime error in ci
- add better debug test setup
- debug some more
- run this test only
- remove sum backward
- remove inplace in inception block
- undo silly change
- Enable all tests
- fix flake
- fix bug in test setup
- remove anamoly detection
- minor changes to comments
- Apply suggestions from code review Co-authored-by: nabenabe0928 <[email protected]>
- Address comments from Shuhei
- revert change leading to bug
- fix flake
- change comment position in feature validator
- Add documentation for _is_datasets_consistent
- address comments from arlind
- case when all nans in test Co-authored-by: nabenabe0928 <[email protected]>
An updated regularization cocktails branch with the refactored development progress, including changes to the overall implementation to adapt to the refactored development branch.
Includes all the cocktails from the Regularization Cocktails for Tabular Datasets.
Not to be merged at this moment.