Fix fused_qkv print model ValueError (#7109)

Yejing-Lai · hwchen2017 · loadams · web-flow · commit 71807bceba29 · 2025-03-04T21:17:12.000Z
Suppose qkv_linear_weight_shape = [in_features, out_features].
With the fused_qkv GEMM optimization, the qkv linear weight shape is
[3, in_features, out_features]. Unpacking that three-dimensional shape
into two names raises "ValueError: too many values to unpack
(expected 2)" when printing the model.

Solution: Take the last two dimensions of the weight shape as
in_features and out_features.
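
A minimal sketch of the failure and the fix, using plain tuples in place of
tensor shapes so it runs without PyTorch. The helper name `describe_linear`
and the concrete sizes are hypothetical and chosen only for illustration;
the unpacking order mirrors the `extra_repr` line changed in the diff.

```python
def describe_linear(shape):
    """Return (in_features, out_features) from a weight shape.

    Hypothetical helper: it slices the last two dimensions before
    unpacking, as the patched extra_repr does.
    """
    out_features, in_features = tuple(shape)[-2:]
    return in_features, out_features


# Illustrative sizes only (assumptions, not taken from the commit).
plain_shape = (4096, 1024)      # ordinary 2-D linear weight
fused_shape = (3, 4096, 1024)   # fused qkv weight with a leading dim of 3

# Old behaviour: unpacking a 3-D shape into two names raises ValueError.
old_error = None
try:
    out_features, in_features = fused_shape
except ValueError as err:
    old_error = str(err)

# New behaviour: both layouts yield the same pair of feature sizes.
plain_result = describe_linear(plain_shape)
fused_result = describe_linear(fused_shape)
```

Slicing with `[-2:]` leaves the two-dimensional case unchanged, so the fix
needs no branch on the number of dimensions.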

Signed-off-by: Lai, Yejing <yejing.lai@intel.com>
Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
diff --git a/deepspeed/module_inject/layers.py b/deepspeed/module_inject/layers.py
@@ -229,7 +229,7 @@ def __deepcopy__(self, memo):
 
     def extra_repr(self):
         if self.weight is not None:
-            out_features, in_features = self.weight.shape if self.weight is not None else (None, None)
+            out_features, in_features = self.weight.shape[-2:] if self.weight is not None else (None, None)
             dtype = self.weight.dtype if self.weight is not None else None
             extra_repr_str = "in_features={}, out_features={}, bias={}, dtype={}".format(
                 in_features, out_features, self.bias is not None, dtype)