Support complicated use cases with TiedLayerSpec (#7208)
I want to reuse a composite module in the pipeline. For example, the
following `MyModule` has a member `linear`, which is itself a module.
```python
import torch


class MyModule(torch.nn.Module):

    def __init__(self, n_in: int, n_out: int):
        super().__init__()
        self.linear = torch.nn.Linear(n_in, n_out)
        self.layer_norm = torch.nn.LayerNorm(n_out)

    def forward(self, data: torch.Tensor) -> torch.Tensor:
        hidden = self.linear(data)
        hidden = self.layer_norm(hidden)
        return hidden
```
`MyModule.linear.weight` should be synchronized among the ranks that
share it, so I add `linear.weight` to `TiedLayerSpec.tied_weight_attr`.
As an aside, I generate the whole `tied_weight_attr` list with the
following expression.
```python
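# Collect the dotted names of every non-scalar parameter of the layer.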
tied_weight_attr = [name for name, p in layer.named_parameters() if p.numel() > 1]
```
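For context, here is a minimal sketch of how this might be wired into a
pipeline. The stage sizes and the `'shared'` key are made up, and the
`TiedLayerSpec(key, typename, *module_args, tied_weight_attr=...)`
calling convention is my reading of `deepspeed.pipe`, so treat the
details as assumptions:
```python
from deepspeed.pipe import PipelineModule, TiedLayerSpec

layer = MyModule(512, 512)
tied_weight_attr = [name for name, p in layer.named_parameters() if p.numel() > 1]
# e.g. ['linear.weight', 'linear.bias', 'layer_norm.weight', 'layer_norm.bias']

# Two specs sharing the key 'shared' tie the listed weights across stages.
specs = [
    TiedLayerSpec('shared', MyModule, 512, 512, tied_weight_attr=tied_weight_attr),
    TiedLayerSpec('shared', MyModule, 512, 512, tied_weight_attr=tied_weight_attr),
]
# Constructing PipelineModule requires an initialized distributed backend,
# e.g. when launched with the deepspeed runner.
model = PipelineModule(layers=specs, num_stages=2)
```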
However, the builtin `getattr` used by `PipelineModule` cannot resolve a
nested attribute such as `linear.weight`.
Hence, this PR first extends the builtin `getattr` to a recursive
version, `PipelineModule._recursive_getattr`, which resolves each
attribute segment one at a time.
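The recursion is small enough to sketch standalone. The following free
function only illustrates the approach, not the actual method body added
to `PipelineModule`:
```python
import functools

def recursive_getattr(obj, attr_path: str):
    """Resolve a dotted attribute path, e.g. 'linear.weight' -> obj.linear.weight."""
    return functools.reduce(getattr, attr_path.split('.'), obj)

weight = recursive_getattr(MyModule(4, 8), 'linear.weight')  # the nested Parameter
```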
Meanwhile, the order in which tied weights are synchronized matters:
every rank must issue the tying collectives in the same order, or the
job deadlocks. This PR therefore sorts `tie_keys` in
`PipelineModule._index_tied_modules` to avoid hanging.
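A toy illustration of the failure mode (the names below are
hypothetical, not from the PR): if the tie keys come from a set-like
index, their iteration order can differ between processes, so two ranks
may start the tying collectives on different groups and wait on each
other forever. Sorting pins a common order:
```python
tie_keys = {"linear.weight", "word_embeddings.weight"}

for key in sorted(tie_keys):
    # Every rank now visits tied groups in the same order, so matching
    # collectives (e.g. torch.distributed.all_reduce) pair up across ranks.
    sync_tied_group(key)  # hypothetical stand-in for the real synchronization
```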
Signed-off-by: Mingjie Li <[email protected]>
Co-authored-by: Mingjie Li <[email protected]>
Co-authored-by: Masahiro Tanaka <[email protected]>