Inconsistency of CMSIS-NN Quantization Method(Q-format) with ARM Documentation #115
Hi @LEE-SEON-WOO,
Hello @mansnils, thank you for your response. I apologize for any confusion; my limited fluency in English may have caused some misunderstanding. The main reason I raised this issue is that I believe Q-format quantization could also be effective on MCUs, and I am curious why it is no longer supported; for example, whether the decision was driven by accuracy or efficiency concerns. As I understand it, ARM's method quantizes in the Q(m,n) fixed-point format, while TFLM uses affine quantization based on the formula r = S * (q - Z), i.e. q = r/S + Z (where q: quantized value, S: scale factor, Z: zero point, r: real value). In my experience, ARM's method has several advantages. First, it runs faster because requantization reduces to shift operations. Second, it requires less additional computation and less metadata per tensor. Since I only work with models small enough for MCUs, I am not sure how well it scales to larger models. Also, legacy APIs using the Q-format remain easily accessible in other open-source projects such as nnom. Thank you.
Thanks for the link! Why can't NNOM use the new/existing CMSIS-NN API?
Dear @mansnils, thank you for your response. Upon reviewing the CMSIS-NN documentation, I confirmed that the _s suffix indicates compatibility with TensorFlow Lite Micro. Additionally, when examining the algorithm, it is evident that functions such as
Best regards.
Hello.
I am currently in the process of developing using the Q-Format (Qm.n) for quantization. However, upon reviewing the revision history, I noticed that starting from version 4.1.0, the q-format approach is no longer being followed. My current approach aligns with the methods outlined in the following ARM documentation links:
While TensorFlow Lite for Microcontrollers employs a zero point and scale factor for quantization, which requires additional memory and floating-point operations, Q-format based quantization appears more suitable for Cortex-M processors given these constraints.
Could you kindly provide a clear explanation for the necessity of this change? The absence of discussion regarding its impact on speed and accuracy has left me somewhat perplexed. Any insight into the rationale behind this decision would be greatly appreciated, as it would aid in understanding the best practices for quantization within the context of TensorFlow Lite for Microcontrollers and CMSIS-NN.
Thank you for your time and consideration.