Fix to allow all gemm tile shapes #738
base: main
Changes from all commits
e6eeeb6
db2e83b
5b3fd20
8122221
e8cf84f
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -124,6 +124,25 @@ If these environment variables are not set, the installation process will infer | |
| * `CUTLASS_PATH`: either one directory level above the current directory (i.e., `$(pwd)/..`) if installed locally or in the `source` directory of the location in which `cutlass_library` was installed | ||
| * `ONEAPI_ROOT`: the default Intel oneAPI installation path | ||
|
|
||
| #### Performance related environment variables | ||
|
|
||
| For improving performance on Intel PVC/BMG you could try the following: | ||
|
|
||
| * `export IGC_ExtraOCLOptions="-cl-intel-256-GRF-per-thread"` | ||
|
|
||
| Please refer to [Building with Sycl Support](../media/docs/cpp/build/building_with_sycl_support.md#building-with-sycl-for-intel-gpu-support) for the omplete environment setup. | ||
|
|
||
| * `CUTLASS_SYCL_ADDITIONAL_TILE_SHAPES` : Path to JSON file containing workgroup and subgroup tile sizes meant for Intel Xe architecture. Expected format is a list of dictionaries like [{"wg": [256, 256, 32], "sg": [8,4,1]}, ...]. Here `wg` refers to the workgroup tile shape and `sg` refers to the subgroup tile layout. This is enabled only for BF16/FP16 kernels. | ||
|
Can you please point out where we are reading this env value to update the tile descriptions? |
||
|
|
||
| Sample JSON file that may be used for adding tile shapes. | ||
| ``` | ||
| [{"wg":[512, 256, 32],"sg":[8,4,1]}, | ||
| {"wg":[256, 128, 16],"sg":[8,4,1]}] | ||
| ``` | ||
| > Note: This feature is meant for advanced users and should be used only if the existing tile shapes don't deliver the desired performance. We recommend that you first validate and benchmark any custom tile shapes with the SYCL-TLA GEMM examples, which can be found [here](../examples/). | ||
|
I would also remove the example comment, since we don't really validate the examples for performance. |
||
| Please note that additional tile shapes also increase the duration of torch inductor's autotune benchmarking. | ||
|
|
||
|
|
||
| #### Installation | ||
|
|
||
| Stable releases of the SYCL*TLA Python interface are available via the `sycl-tla` PyPI package. | ||
|
|
||
typo: omplete ->complete
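For anyone trying out `CUTLASS_SYCL_ADDITIONAL_TILE_SHAPES`, the JSON layout described in the diff (a list of dictionaries with `wg` and `sg` keys) could be sanity-checked before use with something like the sketch below. The helper names `validate_tile_shapes` and `write_tile_shapes` and the output file name are hypothetical and not part of this PR. The three-positive-integers rule is an assumption inferred from the sample entries, not a documented constraint. How and where the library reads the variable is not shown here.

```python
import json
import os
import tempfile


def validate_tile_shapes(entries):
    """Check entries against the list-of-dicts layout shown in the README diff.

    Assumption: each "wg" and "sg" value is a list of three positive integers,
    as in the sample entries. The library may accept or reject other shapes.
    """
    if not isinstance(entries, list):
        raise ValueError("expected a list of dictionaries")
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {i}: expected a dictionary")
        for key in ("wg", "sg"):
            value = entry.get(key)
            is_valid = (
                isinstance(value, list)
                and len(value) == 3
                and all(
                    isinstance(v, int) and not isinstance(v, bool) and v > 0
                    for v in value
                )
            )
            if not is_valid:
                raise ValueError(
                    f"entry {i}: '{key}' must be a list of three positive integers"
                )
    return entries


def write_tile_shapes(entries, path):
    """Validate the entries, write them as JSON, and return the file path."""
    validate_tile_shapes(entries)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(entries, f, indent=2)
    return path


if __name__ == "__main__":
    # Sample entries copied from the README diff.
    shapes = [
        {"wg": [512, 256, 32], "sg": [8, 4, 1]},
        {"wg": [256, 128, 16], "sg": [8, 4, 1]},
    ]
    # Hypothetical file name in the system temp directory.
    out_path = os.path.join(tempfile.gettempdir(), "extra_tile_shapes.json")
    write_tile_shapes(shapes, out_path)
    # Only affects this process and its children; export the variable in the
    # shell instead if another process needs to see it.
    os.environ["CUTLASS_SYCL_ADDITIONAL_TILE_SHAPES"] = out_path
    print(out_path)
```

Validating up front keeps a malformed entry from surfacing later, during the longer autotune runs that the diff warns about.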