[tune][xegpu] infrastructure for tuning, applied to XeGPU matmul example #63

Draft
rolfmorel wants to merge 1 commit into main from
users/rolfmorel/lh-xegpu-autotuning

@rolfmorel
Contributor

No description provided.

Comment on lines +72 to +74
if dump_schedule:
    print(schedule_module)
    sys.exit(0)
Contributor

With the previous implementation, it was possible to dump the payload followed by the schedule that produced it.
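
A minimal sketch of how that behavior might be kept, under stated assumptions: only `dump_schedule` and `schedule_module` appear in the diff, while `dump_payload`, `payload_module`, and both helper functions are hypothetical names introduced for illustration. The idea is to collect every requested dump in order (payload first, then schedule) and exit only after all of them have been printed.

```python
import sys


def dump_requested(payload_module, schedule_module, dump_payload, dump_schedule):
    """Return the requested dumps in order: payload first, then schedule.

    `dump_payload` and `payload_module` are hypothetical; they are not in the PR diff.
    """
    sections = []
    if dump_payload:
        sections.append(str(payload_module))
    if dump_schedule:
        sections.append(str(schedule_module))
    return sections


def maybe_dump_and_exit(payload_module, schedule_module, dump_payload, dump_schedule):
    """Print whatever was requested, then exit; do nothing if no dump was requested."""
    sections = dump_requested(payload_module, schedule_module, dump_payload, dump_schedule)
    if sections:
        print("\n".join(sections))
        sys.exit(0)
```

Separating the collection step from the `sys.exit(0)` call also makes the ordering easy to check without terminating the process.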

Comment on lines +69 to +73
assert 64 <= wg_m <= 256 and m % wg_m == 0 and wg_m % DPAS.M == 0
assert 64 <= wg_n <= 256 and n % wg_n == 0 and wg_n % DPAS.N == 0
assert 32 <= sg_m <= 128 and m % sg_m == 0 and sg_m % DPAS.M == 0
assert 32 <= sg_n <= 128 and n % sg_n == 0 and sg_n % DPAS.N == 0
assert 16 <= k_tile <= 50 and k % k_tile == 0 and k_tile % DPAS.K == 0
Contributor

This is very nice! We do, however, need some mechanism to change the search space bounds from outside; sometimes you want to autotune with a larger search space, for example. Maybe the bounds could be set in the constructor of the abstract schedule (i.e. one without concrete parameter values chosen)?
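
To make the suggestion concrete, here is a rough sketch of bounds passed in at construction time. Everything in it is an assumption rather than the PR's actual API: the `SearchSpace` class, its `is_valid` method, and the `DPAS` stand-in with placeholder tile sizes are all hypothetical. The default ranges simply mirror the asserts quoted above.

```python
from dataclasses import dataclass
from typing import Tuple

Range = Tuple[int, int]  # inclusive (lo, hi) bounds


class DPAS:
    # Placeholder tile sizes for illustration only; the real values live in the PR.
    M, N, K = 8, 16, 16


@dataclass(frozen=True)
class SearchSpace:
    """Hypothetical holder for tunable bounds; defaults mirror the asserts in the diff."""

    wg_m: Range = (64, 256)
    wg_n: Range = (64, 256)
    sg_m: Range = (32, 128)
    sg_n: Range = (32, 128)
    k_tile: Range = (16, 50)

    def is_valid(self, m, n, k, wg_m, wg_n, sg_m, sg_n, k_tile) -> bool:
        """Check one concrete parameter choice against the configured bounds."""

        def ok(value, bounds, dim, unit):
            lo, hi = bounds
            # Same three conditions as each assert: in range, divides the
            # problem dimension, and is a multiple of the DPAS tile size.
            return lo <= value <= hi and dim % value == 0 and value % unit == 0

        return (
            ok(wg_m, self.wg_m, m, DPAS.M)
            and ok(wg_n, self.wg_n, n, DPAS.N)
            and ok(sg_m, self.sg_m, m, DPAS.M)
            and ok(sg_n, self.sg_n, n, DPAS.N)
            and ok(k_tile, self.k_tile, k, DPAS.K)
        )


# Usage: widen only the workgroup-M range for a larger autotuning run.
default_space = SearchSpace()
wider_space = SearchSpace(wg_m=(64, 512))
```

With this shape, the abstract schedule could take a `SearchSpace` in its constructor and fall back to the defaults when none is given, so existing callers would not need to change.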
