You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Similar to affine quantization, we can implement codebook or look up table based quantization, which is another popular type of quantization, especially for lower bits like 4 bits or below (used in https://github.com/Vahe1994/AQLM, https://arxiv.org/abs/2402.04396 etc.). We can start with post training quantization and use k-means clustering to find the codebook / lookup table. You can check out #391 for the overall structure of torchao stack. Reference code for k-means can be found here.
After this we can also add more support for the advanced algorithms mentioned above.
Hey @jerryzh168 I am new to torchao but this sounds like an issue I would want to investigate with my partner @Harthi7. We will take a look and let you know how it goes. Cheers!
Similar to affine quantization, we can implement codebook or look up table based quantization, which is another popular type of quantization, especially for lower bits like 4 bits or below (used in https://github.com/Vahe1994/AQLM, https://arxiv.org/abs/2402.04396 etc.). We can start with post training quantization and use k-means clustering to find the codebook / lookup table. You can check out #391 for the overall structure of torchao stack. Reference code for k-means can be found here.
After this we can also add more support for the advanced algorithms mentioned above.
API
Implementation details:
Needs to flesh out the details of args etc. but can be done in the PR. I'd suggest to gradually add things and gather feedback.
Code Location: add a
codebook
folder under https://github.com/pytorch/ao/tree/main/torchao/prototype/quantizationThe text was updated successfully, but these errors were encountered: