@jianyizh jianyizh commented Sep 4, 2025

Follows #1883. For input shape [4096, 256, 6, 6] in channels-last format with output shape [6, 6] (from torchbench alexnet), this gives a ~4x improvement on BMG.
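
For context, here is a minimal sketch of the usual adaptive-pooling window rule (start = floor(o * in / out), end = ceil((o + 1) * in / out)). The helper name `pool_windows` is hypothetical and is not code from this PR. Under this rule, a 6x6 input pooled to a 6x6 output gives every output element a window of exactly one input element, so the benchmarked case is dominated by memory traffic rather than arithmetic.

```python
def pool_windows(in_size: int, out_size: int) -> list[tuple[int, int]]:
    """Return the half-open [start, end) input range for each output index.

    Hypothetical helper for illustration; not part of the PR.
    """
    windows = []
    for o in range(out_size):
        start = (o * in_size) // out_size            # floor(o * in / out)
        end = -((-(o + 1) * in_size) // out_size)    # ceil((o + 1) * in / out)
        windows.append((start, end))
    return windows


# Benchmarked case: 6 input rows/cols pooled to 6 outputs, one element per window.
print(pool_windows(6, 6))
# Non-divisible case: windows may overlap.
print(pool_windows(7, 3))
```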

@Copilot Copilot AI review requested due to automatic review settings September 4, 2025 03:09

@Copilot Copilot AI left a comment

Pull Request Overview

This PR optimizes the 2D adaptive average pooling operation by introducing a vectorized implementation for the channels-last memory format. Vectorization improves the memory access pattern and therefore performance.

Key changes:

  • Add vectorized kernel implementation for adaptive average pooling in channels-last format
  • Replace the original channel-last kernel with the new optimized vectorized version
  • Add necessary memory access utilities for vectorization support
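
As a rough illustration of why channels-last suits vectorization, the sketch below pools a flat NHWC buffer and walks the channel dimension in contiguous chunks of `vec_size` elements. This is a pure-Python reference under stated assumptions, not the PR's kernel: the function name, the `vec_size` parameter, and the chunking scheme are all hypothetical, and each list slice only stands in for what a single vector load would fetch on the device.

```python
def adaptive_avg_pool2d_nhwc(x, n, h, w, c, out_h, out_w, vec_size=4):
    """Adaptive average pool over a flat NHWC list, chunking the channel axis.

    Hypothetical reference for illustration; not the kernel from the PR.
    """
    def window(o, in_size, out_size):
        start = (o * in_size) // out_size
        end = -((-(o + 1) * in_size) // out_size)  # ceiling division
        return start, end

    out = [0.0] * (n * out_h * out_w * c)
    for b in range(n):
        for oh in range(out_h):
            h0, h1 = window(oh, h, out_h)
            for ow in range(out_w):
                w0, w1 = window(ow, w, out_w)
                count = (h1 - h0) * (w1 - w0)
                # Channels are innermost in NHWC, so each chunk is one
                # contiguous run of the buffer.
                for c0 in range(0, c, vec_size):
                    width = min(vec_size, c - c0)  # last chunk may be short
                    acc = [0.0] * width
                    for ih in range(h0, h1):
                        for iw in range(w0, w1):
                            base = ((b * h + ih) * w + iw) * c + c0
                            chunk = x[base:base + width]
                            acc = [a + v for a, v in zip(acc, chunk)]
                    obase = ((b * out_h + oh) * out_w + ow) * c + c0
                    out[obase:obase + width] = [a / count for a in acc]
    return out


# 1x2x2x3 input pooled to 1x1: each channel is averaged over 4 positions.
x = [float(i) for i in range(12)]
print(adaptive_avg_pool2d_nhwc(x, 1, 2, 2, 3, 1, 1, vec_size=2))
```

In NCHW layout the same per-channel values would be `h * w` elements apart, so the chunked read above would not map to one contiguous access.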

@jianyizh jianyizh force-pushed the jianyi/adptive_avg_pool branch from dc988db to e875db2 Compare September 8, 2025 04:31