ZeroPointDomain as an argument #1264
Comments
Makes sense. I think we can just expose zero_point_domain as an argument; it now has 3 options (ao/torchao/quantization/quant_primitives.py, lines 74 to 76 in 2ba1a61).
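To make the role of the zero-point domain concrete, here is a minimal pure-Python sketch of how dequantization could differ per domain. The enum `ZeroPointDomainSketch`, its member names, the `dequantize` helper, and the `mid_point=8` default (unsigned 4-bit codes) are all illustrative assumptions, not the torchao definitions referenced above.

```python
from enum import Enum, auto


class ZeroPointDomainSketch(Enum):
    # Hypothetical stand-in for torchao's ZeroPointDomain; member names are assumed.
    INT = auto()    # zero point is expressed in the quantized integer domain
    FLOAT = auto()  # zero point is expressed in the floating-point domain
    NONE = auto()   # symmetric quantization, no zero point at all


def dequantize(q, scale, zero_point, domain, mid_point=8):
    """Dequantize a single quantized value under the given zero-point domain.

    mid_point=8 assumes unsigned 4-bit codes in the range 0..15.
    """
    if domain is ZeroPointDomainSketch.INT:
        # Subtract the integer zero point first, then scale.
        return (q - zero_point) * scale
    if domain is ZeroPointDomainSketch.FLOAT:
        # Center on the mid point, scale, then add the float zero point.
        return (q - mid_point) * scale + zero_point
    # NONE: no zero point is applied.
    return q * scale
```

The same quantized code therefore maps to different real values depending on the domain, which is why the domain has to match what the kernel expects.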
We can do that, but the packing format of int4_weight_only is not normal.
Hi @HDCharles, could you give more details about "normal"?
@HDCharles The packing of scales and zero points into one tensor might be the limit of:
- func: _weight_int4pack_mm(Tensor self, Tensor mat2, int qGroupSize, Tensor qScaleAndZeros) -> Tensor
- func: _weight_int4pack_mm_with_scale_and_zeros(Tensor self, Tensor mat2, int qGroupSize, Tensor qScale, Tensor qZeros) ->
Does it make sense to you? cc @jgong5
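The difference between the two signatures can be sketched in plain Python. This is only an illustration using nested lists: the helper names `pack_scales_and_zeros` and `unpack_scales_and_zeros` and the `[n_groups][n_channels][2]` layout are assumptions, and the real packed layout used by the PyTorch op may differ.

```python
def pack_scales_and_zeros(scales, zeros):
    """Interleave per-group scales and zero points into one nested list.

    scales, zeros: nested lists of shape [n_groups][n_channels].
    Returns a nested list of shape [n_groups][n_channels][2].
    """
    if len(scales) != len(zeros):
        raise ValueError("scales and zeros must have the same number of groups")
    return [
        [[s, z] for s, z in zip(scale_row, zero_row)]
        for scale_row, zero_row in zip(scales, zeros)
    ]


def unpack_scales_and_zeros(packed):
    """Split the packed structure back into separate scales and zeros."""
    scales = [[pair[0] for pair in row] for row in packed]
    zeros = [[pair[1] for pair in row] for row in packed]
    return scales, zeros
```

A single real tensor has one dtype, so a combined qScaleAndZeros forces scales and zero points to share it. Passing qScale and qZeros separately removes that constraint, for example allowing integer zero points next to floating-point scales.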
This is easy to do with a different layout, like #1278. You can use a different op as well if the packing format is different. What backend are you planning to develop? XPU?
OK, please let us know your plan and whether #1278 is enough to address the concern here.
Context
The current ZeroPointDomain is bound to the layout (ao/torchao/quantization/quant_api.py, lines 607 to 615 in 2ba1a61).
Ideally, we should allow the data types of zero points to be specified as arguments. There are two main benefits:
Proposals
Add an optional argument to let users specify the data types of zero points:
Meanwhile, we will overload _weight_int4pack_mm to take zero points and scales as separate tensors.
An example usage:
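As a rough illustration of the optional-argument proposal, the sketch below models a configuration function that falls back to the layout's implied domain unless the caller overrides it. Everything here is hypothetical: `int4_weight_only_sketch`, `Int4WeightOnlySketch`, the string domain names, and the default table are stand-ins, not the torchao API.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed mapping from layout name to the zero-point domain it implies today.
_DEFAULT_DOMAIN_FOR_LAYOUT = {"tensor_core_tiled": "float"}
_VALID_DOMAINS = ("int", "float", "none")


@dataclass(frozen=True)
class Int4WeightOnlySketch:
    group_size: int
    layout: str
    zero_point_domain: str


def int4_weight_only_sketch(
    group_size: int = 128,
    layout: str = "tensor_core_tiled",
    zero_point_domain: Optional[str] = None,
) -> Int4WeightOnlySketch:
    """Build a config; zero_point_domain=None keeps the layout's default."""
    if zero_point_domain is None:
        # Preserve existing behavior when the caller does not specify a domain.
        zero_point_domain = _DEFAULT_DOMAIN_FOR_LAYOUT[layout]
    if zero_point_domain not in _VALID_DOMAINS:
        raise ValueError(f"unsupported zero_point_domain: {zero_point_domain!r}")
    return Int4WeightOnlySketch(group_size, layout, zero_point_domain)
```

Calling `int4_weight_only_sketch()` keeps the layout's default domain, while `int4_weight_only_sketch(zero_point_domain="int")` overrides it. Making the argument optional means existing callers are unaffected.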