Regarding the current PAI adaptation approach, I have a question: why is adapting low_freq_factor / high_freq_factor not considered?
We have implemented this parameter internally but do not expose it through an external interface. If you need it, please modify the corresponding code; see Pai-Megatron-Patch/megatron_patch/model/llama3_1/model.py, line 127 (commit 9d3e557).
"rope_scaling": {
"factor": 8.0,
"low_freq_factor": 1.0,
"high_freq_factor": 4.0,
"original_max_position_embeddings": 8192,
"rope_type": "llama3"
},
The low_freq_factor / high_freq_factor scaling here affects the positional encoding applied in attention: with high_freq_factor=4, the high-frequency dimensions of the encoding are certainly altered. How should these parameters be adapted when actually training a Llama 3.1 model?
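For context, the llama3-style rope_scaling rescales the RoPE inverse frequencies band by band: wavelengths shorter than original_max_position_embeddings / high_freq_factor are kept unchanged, wavelengths longer than original_max_position_embeddings / low_freq_factor are divided by factor, and the band in between is smoothly interpolated. A minimal sketch of this scheme (the function name is my own; this follows the publicly documented Llama 3.1 behavior, not necessarily the exact code at the line referenced above):

```python
import math

def llama3_scale_inv_freq(inv_freq, factor=8.0, low_freq_factor=1.0,
                          high_freq_factor=4.0, original_max_position=8192):
    """Rescale RoPE inverse frequencies following the llama3 rope_scaling scheme."""
    low_freq_wavelen = original_max_position / low_freq_factor    # longest unscaled band edge
    high_freq_wavelen = original_max_position / high_freq_factor  # shortest scaled band edge
    scaled = []
    for freq in inv_freq:
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:
            # High-frequency dimensions: left untouched.
            scaled.append(freq)
        elif wavelen > low_freq_wavelen:
            # Low-frequency dimensions: scaled down by `factor`.
            scaled.append(freq / factor)
        else:
            # Mid band: smooth interpolation between the two regimes.
            smooth = (original_max_position / wavelen - low_freq_factor) \
                     / (high_freq_factor - low_freq_factor)
            scaled.append((1 - smooth) * freq / factor + smooth * freq)
    return scaled
```

With the config values above, a base frequency whose wavelength exceeds 8192 positions is divided by 8, while frequencies with wavelengths below 2048 pass through unchanged, which is why high_freq_factor=4 determines where the "untouched" band begins.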