diff --git a/README.md b/README.md
index 739648d1527d..79a38b3c95a9 100644
--- a/README.md
+++ b/README.md
@@ -68,29 +68,30 @@
 | Model Series | Model Name |
 |:------------:|:-----------|
-| [LLaMA](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/llama) | facebook/llama-7b, facebook/llama-13b, facebook/llama-30b, facebook/llama-65b, ziqingyang/chinese-llama-7b, ziqingyang/chinese-llama-13b, ziqingyang/chinese-alpaca-7b, ziqingyang/chinese-alpaca-13b, idea-ccnl/ziya-llama-13b-v1 |
-| [LLama2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/llama) | meta-llama/Llama-2-7b, meta-llama/Llama-2-7b-chat, meta-llama/Llama-2-13b, meta-llama/Llama-2-13b-chat, meta-llama/Llama-2-70b, meta-llama/Llama-2-70b-chat |
-| [LLama3](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/llama) | meta-llama/Meta-Llama-3-8B, meta-llama/Meta-Llama-3-8B-Instruct, meta-llama/Meta-Llama-3-70B, meta-llama/Meta-Llama-3-70B-Instruct |
-| [LLama3.1](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/llama) | meta-llama/Meta-Llama-3.1-8B, meta-llama/Meta-Llama-3.1-8B-Instruct, meta-llama/Meta-Llama-3.1-70B, meta-llama/Meta-Llama-3.1-70B-Instruct, meta-llama/Meta-Llama-3.1-405B, meta-llama/Meta-Llama-3.1-405B-Instruct, meta-llama/Llama-Guard-3-8B |
-| [LLama3.2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/llama) | meta-llama/Llama-3.2-1B, meta-llama/Llama-3.2-1B-Instruct, meta-llama/Llama-3.2-3B, meta-llama/Llama-3.2-3B-Instruct, meta-llama/Llama-Guard-3-1B |
-| [Baichuan](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/baichuan) | baichuan-inc/Baichuan-7B, baichuan-inc/Baichuan-13B-Base, baichuan-inc/Baichuan-13B-Chat |
-| [Baichuan2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/baichuan) | baichuan-inc/Baichuan2-7B-Base, baichuan-inc/Baichuan2-7B-Chat, baichuan-inc/Baichuan2-13B-Base, baichuan-inc/Baichuan2-13B-Chat |
-| [Bloom](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/bloom) | bigscience/bloom-560m, bigscience/bloom-560m-bf16, bigscience/bloom-1b1, bigscience/bloom-3b, bigscience/bloom-7b1, bigscience/bloomz-560m, bigscience/bloomz-1b1, bigscience/bloomz-3b, bigscience/bloomz-7b1-mt, bigscience/bloomz-7b1-p3, bigscience/bloomz-7b1, bellegroup/belle-7b-2m |
-| [ChatGLM](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/chatglm/) | THUDM/chatglm-6b, THUDM/chatglm-6b-v1.1 |
-| [ChatGLM2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/chatglm2) | THUDM/chatglm2-6b |
-| [ChatGLM3](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/chatglm2) | THUDM/chatglm3-6b |
-| [Gemma](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/gemma) | google/gemma-7b, google/gemma-7b-it, google/gemma-2b, google/gemma-2b-it |
-| [Mistral](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/mistral) | mistralai/Mistral-7B-Instruct-v0.3, mistralai/Mistral-7B-v0.1 |
-| [Mixtral](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/mixtral) | mistralai/Mixtral-8x7B-Instruct-v0.1 |
-| [OPT](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/opt) | facebook/opt-125m, facebook/opt-350m, facebook/opt-1.3b, facebook/opt-2.7b, facebook/opt-6.7b, facebook/opt-13b, facebook/opt-30b, facebook/opt-66b, facebook/opt-iml-1.3b, opt-iml-max-1.3b |
-| [Qwen](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | qwen/qwen-7b, qwen/qwen-7b-chat, qwen/qwen-14b, qwen/qwen-14b-chat, qwen/qwen-72b, qwen/qwen-72b-chat, |
-| [Qwen1.5](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen1.5-0.5B, Qwen/Qwen1.5-0.5B-Chat, Qwen/Qwen1.5-1.8B, Qwen/Qwen1.5-1.8B-Chat, Qwen/Qwen1.5-4B, Qwen/Qwen1.5-4B-Chat, Qwen/Qwen1.5-7B, Qwen/Qwen1.5-7B-Chat, Qwen/Qwen1.5-14B, Qwen/Qwen1.5-14B-Chat, Qwen/Qwen1.5-32B, Qwen/Qwen1.5-32B-Chat, Qwen/Qwen1.5-72B, Qwen/Qwen1.5-72B-Chat, Qwen/Qwen1.5-110B, Qwen/Qwen1.5-110B-Chat, Qwen/Qwen1.5-MoE-A2.7B, Qwen/Qwen1.5-MoE-A2.7B-Chat |
-| [Qwen2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2-0.5B, Qwen/Qwen2-0.5B-Instruct, Qwen/Qwen2-1.5B, Qwen/Qwen2-1.5B-Instruct, Qwen/Qwen2-7B, Qwen/Qwen2-7B-Instruct, Qwen/Qwen2-72B, Qwen/Qwen2-72B-Instruct, Qwen/Qwen2-57B-A14B, Qwen/Qwen2-57B-A14B-Instruct |
-| [Qwen2-Math](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2-Math-1.5B, Qwen/Qwen2-Math-1.5B-Instruct, Qwen/Qwen2-Math-7B, Qwen/Qwen2-Math-7B-Instruct, Qwen/Qwen2-Math-72B, Qwen/Qwen2-Math-72B-Instruct, Qwen/Qwen2-Math-RM-72B |
-| [Qwen2.5](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2.5-0.5B, Qwen/Qwen2.5-0.5B-Instruct, Qwen/Qwen2.5-1.5B, Qwen/Qwen2.5-1.5B-Instruct, Qwen/Qwen2.5-3B, Qwen/Qwen2.5-3B-Instruct, Qwen/Qwen2.5-7B, Qwen/Qwen2.5-7B-Instruct, Qwen/Qwen2.5-14B, Qwen/Qwen2.5-14B-Instruct, Qwen/Qwen2.5-32B, Qwen/Qwen2.5-32B-Instruct, Qwen/Qwen2.5-72B, Qwen/Qwen2.5-72B-Instruct |
-| [Qwen2.5-Math](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2.5-Math-1.5B, Qwen/Qwen2.5-Math-1.5B-Instruct, Qwen/Qwen2.5-Math-7B, Qwen/Qwen2.5-Math-7B-Instruct, Qwen/Qwen2.5-Math-72B, Qwen/Qwen2.5-Math-72B-Instruct, Qwen/Qwen2.5-Math-RM-72B |
-| [Qwen2.5-Coder](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2.5-Coder-1.5B, Qwen/Qwen2.5-Coder-1.5B-Instruct, Qwen/Qwen2.5-Coder-7B, Qwen/Qwen2.5-Coder-7B-Instruct |
-| [Yuan2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/yuan/) | IEITYuan/Yuan2-2B, IEITYuan/Yuan2-51B, IEITYuan/Yuan2-102B |
+| [LLaMA] | facebook/llama-7b, facebook/llama-13b, facebook/llama-30b, facebook/llama-65b, ziqingyang/chinese-llama-7b, ziqingyang/chinese-llama-13b, ziqingyang/chinese-alpaca-7b, ziqingyang/chinese-alpaca-13b, idea-ccnl/ziya-llama-13b-v1 |
+| [Llama2] | meta-llama/Llama-2-7b, meta-llama/Llama-2-7b-chat, meta-llama/Llama-2-13b, meta-llama/Llama-2-13b-chat, meta-llama/Llama-2-70b, meta-llama/Llama-2-70b-chat, linly-ai/chinese-llama-2-7b, linly-ai/chinese-llama-2-13b, FlagAlpha/Llama2-Chinese-7b-Chat, FlagAlpha/Llama2-Chinese-13b-Chat |
+| [Llama3] | meta-llama/Meta-Llama-3-8B, meta-llama/Meta-Llama-3-8B-Instruct, meta-llama/Meta-Llama-3-70B, meta-llama/Meta-Llama-3-70B-Instruct |
+| [Llama3.1] | meta-llama/Meta-Llama-3.1-8B, meta-llama/Meta-Llama-3.1-8B-Instruct, meta-llama/Meta-Llama-3.1-70B, meta-llama/Meta-Llama-3.1-70B-Instruct, meta-llama/Meta-Llama-3.1-405B, meta-llama/Meta-Llama-3.1-405B-Instruct, meta-llama/Llama-Guard-3-8B |
+| [Llama3.2] | meta-llama/Llama-3.2-1B, meta-llama/Llama-3.2-1B-Instruct, meta-llama/Llama-3.2-3B, meta-llama/Llama-3.2-3B-Instruct, meta-llama/Llama-Guard-3-1B |
+| [Baichuan] | baichuan-inc/Baichuan-7B, baichuan-inc/Baichuan-13B-Base, baichuan-inc/Baichuan-13B-Chat |
+| [Baichuan2] | baichuan-inc/Baichuan2-7B-Base, baichuan-inc/Baichuan2-7B-Chat, baichuan-inc/Baichuan2-13B-Base, baichuan-inc/Baichuan2-13B-Chat |
+| [Bloom] | bigscience/bloom-560m, bigscience/bloom-560m-bf16, bigscience/bloom-1b1, bigscience/bloom-3b, bigscience/bloom-7b1, bigscience/bloomz-560m, bigscience/bloomz-1b1, bigscience/bloomz-3b, bigscience/bloomz-7b1-mt, bigscience/bloomz-7b1-p3, bigscience/bloomz-7b1, bellegroup/belle-7b-2m |
+| [ChatGLM] | THUDM/chatglm-6b, THUDM/chatglm-6b-v1.1 |
+| [ChatGLM2] | THUDM/chatglm2-6b |
+| [ChatGLM3] | THUDM/chatglm3-6b |
+| [Gemma] | google/gemma-7b, google/gemma-7b-it, google/gemma-2b, google/gemma-2b-it |
+| [Mistral] | mistralai/Mistral-7B-Instruct-v0.3, mistralai/Mistral-7B-v0.1 |
+| [Mixtral] | mistralai/Mixtral-8x7B-Instruct-v0.1 |
+| [OPT] | facebook/opt-125m, facebook/opt-350m, facebook/opt-1.3b, facebook/opt-2.7b, facebook/opt-6.7b, facebook/opt-13b, facebook/opt-30b, facebook/opt-66b, facebook/opt-iml-1.3b, opt-iml-max-1.3b |
+| [Qwen] | qwen/qwen-7b, qwen/qwen-7b-chat, qwen/qwen-14b, qwen/qwen-14b-chat, qwen/qwen-72b, qwen/qwen-72b-chat |
+| [Qwen1.5] | Qwen/Qwen1.5-0.5B, Qwen/Qwen1.5-0.5B-Chat, Qwen/Qwen1.5-1.8B, Qwen/Qwen1.5-1.8B-Chat, Qwen/Qwen1.5-4B, Qwen/Qwen1.5-4B-Chat, Qwen/Qwen1.5-7B, Qwen/Qwen1.5-7B-Chat, Qwen/Qwen1.5-14B, Qwen/Qwen1.5-14B-Chat, Qwen/Qwen1.5-32B, Qwen/Qwen1.5-32B-Chat, Qwen/Qwen1.5-72B, Qwen/Qwen1.5-72B-Chat, Qwen/Qwen1.5-110B, Qwen/Qwen1.5-110B-Chat, Qwen/Qwen1.5-MoE-A2.7B, Qwen/Qwen1.5-MoE-A2.7B-Chat |
+| [Qwen2] | Qwen/Qwen2-0.5B, Qwen/Qwen2-0.5B-Instruct, Qwen/Qwen2-1.5B, Qwen/Qwen2-1.5B-Instruct, Qwen/Qwen2-7B, Qwen/Qwen2-7B-Instruct, Qwen/Qwen2-72B, Qwen/Qwen2-72B-Instruct, Qwen/Qwen2-57B-A14B, Qwen/Qwen2-57B-A14B-Instruct |
+| [Qwen2-Math] | Qwen/Qwen2-Math-1.5B, Qwen/Qwen2-Math-1.5B-Instruct, Qwen/Qwen2-Math-7B, Qwen/Qwen2-Math-7B-Instruct, Qwen/Qwen2-Math-72B, Qwen/Qwen2-Math-72B-Instruct, Qwen/Qwen2-Math-RM-72B |
+| [Qwen2.5] | Qwen/Qwen2.5-0.5B, Qwen/Qwen2.5-0.5B-Instruct, Qwen/Qwen2.5-1.5B, Qwen/Qwen2.5-1.5B-Instruct, Qwen/Qwen2.5-3B, Qwen/Qwen2.5-3B-Instruct, Qwen/Qwen2.5-7B, Qwen/Qwen2.5-7B-Instruct, Qwen/Qwen2.5-14B, Qwen/Qwen2.5-14B-Instruct, Qwen/Qwen2.5-32B, Qwen/Qwen2.5-32B-Instruct, Qwen/Qwen2.5-72B, Qwen/Qwen2.5-72B-Instruct |
+| [Qwen2.5-Math] | Qwen/Qwen2.5-Math-1.5B, Qwen/Qwen2.5-Math-1.5B-Instruct, Qwen/Qwen2.5-Math-7B, Qwen/Qwen2.5-Math-7B-Instruct, Qwen/Qwen2.5-Math-72B, Qwen/Qwen2.5-Math-72B-Instruct, Qwen/Qwen2.5-Math-RM-72B |
+| [Qwen2.5-Coder] | Qwen/Qwen2.5-Coder-1.5B, Qwen/Qwen2.5-Coder-1.5B-Instruct, Qwen/Qwen2.5-Coder-7B, Qwen/Qwen2.5-Coder-7B-Instruct |
+| [Yuan2] | IEITYuan/Yuan2-2B, IEITYuan/Yuan2-51B, IEITYuan/Yuan2-102B |
+
 * 4D parallelism and operator optimization already support the LLaMA, Baichuan, Bloom, ChatGLM, Gemma, Mistral, OPT, and Qwen model families. The 4D-parallelism and operator support matrix for LLM models is as follows:
@@ -119,18 +120,18 @@
 | Model | Pretrain | SFT | LoRA | FlashMask | Prefix Tuning | DPO/SimPO/ORPO | RLHF | Quantization |
 |--------------------------------------------|:--------:|:---:|:----:|:---------:|:-------------:|:--------------:|:----:|:------------:|
-| [Llama](./llm/config/llama) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| [Qwen](./llm/config/qwen) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | 🚧 |
-| [Mixtral](./llm/config/mixtral) | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | 🚧 | 🚧 |
-| [Mistral](./llm/config/mistral) | ✅ | ✅ | ✅ | 🚧 | ✅ | ✅ | 🚧 | 🚧 |
-| [Baichuan/Baichuan2](./llm/config/llama) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | ✅ |
-| [ChatGLM-6B](./llm/config/chatglm) | ✅ | ✅ | ✅ | 🚧 | ✅ | 🚧 | 🚧 | ✅ |
-| [ChatGLM2/ChatGLM3](./llm/config/chatglm2) | ✅ | ✅ | ✅ | 🚧 | ✅ | ✅ | 🚧 | ✅ |
-| [Bloom](./llm/config/bloom) | ✅ | ✅ | ✅ | 🚧 | ✅ | 🚧 | 🚧 | ✅ |
-| [GPT-3](./llm/config/gpt-3) | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 |
-| [OPT](./llm/config/opt) | ✅ | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 |
-| [Gemma](./llm/config/gemma) | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | 🚧 | 🚧 |
-| [Yuan](./llm/config/yuan) | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | 🚧 | 🚧 |
+| [Llama] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [Qwen] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | 🚧 |
+| [Mixtral] | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | 🚧 | 🚧 |
+| [Mistral] | ✅ | ✅ | ✅ | 🚧 | ✅ | ✅ | 🚧 | 🚧 |
+| [Baichuan/Baichuan2] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | ✅ |
+| [ChatGLM-6B] | ✅ | ✅ | ✅ | 🚧 | ✅ | 🚧 | 🚧 | ✅ |
+| [ChatGLM2/ChatGLM3] | ✅ | ✅ | ✅ | 🚧 | ✅ | ✅ | 🚧 | ✅ |
+| [Bloom] | ✅ | ✅ | ✅ | 🚧 | ✅ | 🚧 | 🚧 | ✅ |
+| [GPT-3] | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 |
+| [OPT] | ✅ | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 |
+| [Gemma] | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | 🚧 | 🚧 |
+| [Yuan] | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | 🚧 | 🚧 |
 
 * [LLM inference](./llm/docs/predict/inference.md) already supports the LLaMA, Qwen, Mistral, ChatGLM, Bloom, and Baichuan families, with Weight-Only INT8 and INT4 inference as well as WAC (weight, activation, Cache KV) INT8 and FP8 quantized inference. The LLM inference support matrix is as follows:
diff --git a/llm/README.md b/llm/README.md
index 7c297f752aea..a0c7d662bc3b 100644
--- a/llm/README.md
+++ b/llm/README.md
@@ -17,18 +17,18 @@
 | Model | Pretrain | SFT | LoRA | Prefix Tuning | DPO/SimPO/ORPO | RLHF | Quantization | Torch convert |
 |----------------------------------------|----------|-----|------|---------------|-----|------|--------------|---------------|
-| [LLaMA](./config/llama) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| [Qwen](./config/qwen) | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ |
-| [Mixtral](./config/mixtral) | ✅ | ✅ | ✅ | ❌ | ✅ | 🚧 | 🚧 | 🚧 |
-| [Mistral](./config/mistral) | ❌ | ✅ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ |
-| [Baichuan/Baichuan2](./config/llama) | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | ✅ | ✅ |
-| [ChatGLM-6B](./config/chatglm) | ❌ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | ❌ |
-| [ChatGLM2/ChatGLM3](./config/chatglm2) | ❌ | ✅ | ✅ | ✅ | ✅ | 🚧 | ✅ | ✅ |
-| [Bloom](./config/bloom) | ❌ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | ✅ |
-| [GPT-3](./config/gpt-3) | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | ✅ |
-| [OPT](./config/opt) | 🚧 | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | ✅ |
-| [Gemma](./config/gemma) | 🚧 | ✅ |🚧 | 🚧 | ✅ | 🚧 | 🚧 | 🚧 |
-| [Yuan](./config/yuan) | ✅ | ✅ |✅ | 🚧 | ✅ | 🚧 | 🚧 | 🚧 |
+| [LLaMA] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [Qwen] | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ |
+| [Mixtral] | ✅ | ✅ | ✅ | ❌ | ✅ | 🚧 | 🚧 | 🚧 |
+| [Mistral] | ❌ | ✅ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ |
+| [Baichuan/Baichuan2] | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | ✅ | ✅ |
+| [ChatGLM-6B] | ❌ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | ❌ |
+| [ChatGLM2/ChatGLM3] | ❌ | ✅ | ✅ | ✅ | ✅ | 🚧 | ✅ | ✅ |
+| [Bloom] | ❌ | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | ✅ |
+| [GPT-3] | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | ✅ |
+| [OPT] | 🚧 | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | ✅ |
+| [Gemma] | 🚧 | ✅ | 🚧 | 🚧 | ✅ | 🚧 | 🚧 | 🚧 |
+| [Yuan] | ✅ | ✅ | ✅ | 🚧 | ✅ | 🚧 | 🚧 | 🚧 |
 
 - ✅: Supported
diff --git a/llm/config/baichuan/README.md b/llm/config/baichuan/README.md
deleted file mode 100644
index 8da936f0c1f9..000000000000
--- a/llm/config/baichuan/README.md
+++ /dev/null
@@ -1,15 +0,0 @@
-# Baichuan
-
-## 1. Model Introduction
-
-**Supported model weights:**
-
-| Model |
-|---------------------------------|
-| baichuan-inc/Baichuan-7B |
-| baichuan-inc/Baichuan-13B-Base |
-| baichuan-inc/Baichuan-13B-Chat |
-| baichuan-inc/Baichuan2-7B-Base |
-| baichuan-inc/Baichuan2-7B-Chat |
-| baichuan-inc/Baichuan2-13B-Base |
-| baichuan-inc/Baichuan2-13B-Chat |
diff --git a/llm/config/bloom/README.md b/llm/config/bloom/README.md
deleted file mode 100644
index 625fa33f2c8b..000000000000
--- a/llm/config/bloom/README.md
+++ /dev/null
@@ -1,21 +0,0 @@
-# BLOOM
-
-## 1. Model Introduction
-
-BLOOM is an autoregressive large language model (LLM) trained on vast amounts of text data to generate target text; it supports text interaction in 46 natural languages and 13 programming languages. BLOOM is trained primarily on text-generation tasks and performs well at text continuation; in addition, the BloomZ series incorporates instruction tuning.
-
-**Supported model weights:**
-| Model |
-|----------------------------|
-| bigscience/bloom-560m |
-| bigscience/bloom-560m-bf16 |
-| bigscience/bloom-1b1 |
-| bigscience/bloom-3b |
-| bigscience/bloom-7b1 |
-| bigscience/bloomz-560m |
-| bigscience/bloomz-1b1 |
-| bigscience/bloomz-3b |
-| bigscience/bloomz-7b1-mt |
-| bigscience/bloomz-7b1-p3 |
-| bigscience/bloomz-7b1 |
-| bellegroup/belle-7b-2m |
diff --git a/llm/config/chatglm/README.md b/llm/config/chatglm/README.md
deleted file mode 100644
index 76c4998309f3..000000000000
--- a/llm/config/chatglm/README.md
+++ /dev/null
@@ -1,16 +0,0 @@
-# ChatGLM-6B
-
-## 1. Model Introduction
-
-ChatGLM-6B is an open-source dialogue language model supporting both Chinese and English Q&A, built on the [General Language Model (GLM)](https://arxiv.org/abs/2103.10360) architecture with 6.2 billion parameters. It uses the same technology as ChatGLM and is optimized for Chinese Q&A and dialogue. After bilingual Chinese-English training on roughly 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and reinforcement learning from human feedback, the 6.2B-parameter ChatGLM-6B can already generate answers that align well with human preferences.
-
-**Supported model weights:**
-
-| Model |
-|-----------------------|
-| THUDM/chatglm-6b |
-| THUDM/chatglm-6b-v1.1 |
-
-## 2. Model License
-
-Use of the ChatGLM-6B model weights must comply with the [License](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/paddlenlp/transformers/chatglm/LICENSE).
diff --git a/llm/config/chatglm2/README.md b/llm/config/chatglm2/README.md
deleted file mode 100644
index 3e946cc9b30f..000000000000
--- a/llm/config/chatglm2/README.md
+++ /dev/null
@@ -1,16 +0,0 @@
-# ChatGLM2-6B
-
-## 1. Model Introduction
-
-ChatGLM2-6B is the second-generation version of the open-source Chinese-English dialogue model [ChatGLM-6B](https://github.com/THUDM/ChatGLM-6B). It retains the first generation's many strengths, such as fluent dialogue and a low deployment barrier, while introducing new features including [FlashAttention](https://github.com/HazyResearch/flash-attention) and [Multi-Query Attention](https://arxiv.org/abs/1911.02150v1). See the [ChatGLM2-6B GitHub](https://github.com/THUDM/ChatGLM2-6B) for a more detailed introduction.
-
-**Supported model weights:**
-
-| Model |
-|-------------------|
-| THUDM/chatglm2-6b |
-| THUDM/chatglm3-6b |
-
-## 2. Model License
-
-Use of the ChatGLM2-6B model weights must comply with the [License](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/paddlenlp/transformers/chatglm_v2/LICENSE).
diff --git a/llm/config/gemma/README.md b/llm/config/gemma/README.md
deleted file mode 100644
index c3e8434dc44f..000000000000
--- a/llm/config/gemma/README.md
+++ /dev/null
@@ -1,18 +0,0 @@
-# Gemma
-
-## 1. Model Introduction
-
-[Gemma](https://blog.google/technology/developers/gemma-open-models/), developed by Google DeepMind and other teams across Google, is a family of lightweight, state-of-the-art open models built from the same research and technology as the Gemini models.
-
-**Supported model weights:**
-
-| Model |
-|--------------------|
-| google/gemma-7b |
-| google/gemma-7b-it |
-| google/gemma-2b |
-| google/gemma-2b-it |
-
-## 2. Model Fine-tuning
-
-See the [LLM end-to-end toolchain introduction](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm).
diff --git a/llm/config/gpt-3/README.md b/llm/config/gpt-3/README.md
deleted file mode 100644
index 472c2f74cd42..000000000000
--- a/llm/config/gpt-3/README.md
+++ /dev/null
@@ -1,5 +0,0 @@
-# GPT
-
-## 1. Model Introduction
-
-GPT-3 is a pretrained language model that emulates human language understanding and expression. With 175 billion parameters, it has strong language understanding and generation capabilities and can handle tasks such as text generation, summarization, question answering, translation, and reading comprehension. GPT-3 was pretrained on massive corpora, including large amounts of text from the internet, from which it learned to generate and understand human language. It has been highly influential in natural language processing and is widely used in applications such as intelligent customer service and game design.
diff --git a/llm/config/llama/README.md b/llm/config/llama/README.md
deleted file mode 100644
index 42b6e8f8f9df..000000000000
--- a/llm/config/llama/README.md
+++ /dev/null
@@ -1,52 +0,0 @@
-# LLaMA
-
-## 1. Model Introduction
-
-**Supported model weights:**
-
-| Model |
-|--------------------------------------|
-| facebook/llama-7b |
-| facebook/llama-13b |
-| facebook/llama-30b |
-| facebook/llama-65b |
-| meta-llama/Llama-2-7b |
-| meta-llama/Llama-2-7b-chat |
-| meta-llama/Llama-2-13b |
-| meta-llama/Llama-2-13b-chat |
-| meta-llama/Llama-2-70b |
-| meta-llama/Llama-2-70b-chat |
-| meta-llama/Meta-Llama-3-8B |
-| meta-llama/Meta-Llama-3-8B-Instruct |
-| meta-llama/Meta-Llama-3-70B |
-| meta-llama/Meta-Llama-3-70B-Instruct |
-| ziqingyang/chinese-llama-7b |
-| ziqingyang/chinese-llama-13b |
-| ziqingyang/chinese-alpaca-7b |
-| ziqingyang/chinese-alpaca-13b |
-| idea-ccnl/ziya-llama-13b-v1 |
-| linly-ai/chinese-llama-2-7b |
-| linly-ai/chinese-llama-2-13b |
-| baichuan-inc/Baichuan-7B |
-| baichuan-inc/Baichuan-13B-Base |
-| baichuan-inc/Baichuan-13B-Chat |
-| baichuan-inc/Baichuan2-7B-Base |
-| baichuan-inc/Baichuan2-7B-Chat |
-| baichuan-inc/Baichuan2-13B-Base |
-| baichuan-inc/Baichuan2-13B-Chat |
-| FlagAlpha/Llama2-Chinese-7b-Chat |
-| FlagAlpha/Llama2-Chinese-13b-Chat |
-
-Usage:
-
-```python
-from paddlenlp.transformers import AutoModelForCausalLM, AutoTokenizer
-model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat")
-tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat")
-```
-
-## 2. Model License
-
-Use of the LLaMA model weights must comply with the [License](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/paddlenlp/transformers/llama/LICENSE).
-
-Use of the Llama2 model weights must comply with the [License](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/paddlenlp/transformers/llama/Llama2.LICENSE).
diff --git a/llm/config/mistral/README.md b/llm/config/mistral/README.md
deleted file mode 100644
index 64c6a0f8f8f0..000000000000
--- a/llm/config/mistral/README.md
+++ /dev/null
@@ -1,18 +0,0 @@
-# Mistral
-
-## 1. Model Introduction
-
-**Supported model weights:**
-
-| Model |
-|--------------------------------------|
-| mistralai/Mistral-7B-Instruct-v0.3 |
-| mistralai/Mistral-7B-v0.1 |
-
-Usage:
-
-```python
-from paddlenlp.transformers import AutoModelForCausalLM, AutoTokenizer
-model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")
-tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")
-```
diff --git a/llm/config/mixtral/README.md b/llm/config/mixtral/README.md
deleted file mode 100644
index b0f80c132bcd..000000000000
--- a/llm/config/mixtral/README.md
+++ /dev/null
@@ -1,17 +0,0 @@
-# Mixtral
-
-## 1. Model Introduction
-
-**Supported model weights:**
-
-| Model |
-|--------------------------------------|
-| mistralai/Mixtral-8x7B-Instruct-v0.1 |
-
-Usage:
-
-```python
-from paddlenlp.transformers import AutoModelForCausalLM, AutoTokenizer
-model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")
-tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")
-```
diff --git a/llm/config/opt/README.md b/llm/config/opt/README.md
deleted file mode 100644
index 7f24e921d81d..000000000000
--- a/llm/config/opt/README.md
+++ /dev/null
@@ -1,19 -0,0 @@
-# OPT
-
-## 1. Model Introduction
-
-[OPT: Open Pre-trained Transformer Language Models](https://arxiv.org/abs/2205.01068) is a general-purpose autoregressive language model that can be applied to a wide range of understanding and generation tasks.
-
-**Supported model weights:**
-| Model |
-|-----------------------|
-| facebook/opt-125m |
-| facebook/opt-350m |
-| facebook/opt-1.3b |
-| facebook/opt-2.7b |
-| facebook/opt-6.7b |
-| facebook/opt-13b |
-| facebook/opt-30b |
-| facebook/opt-66b |
-| facebook/opt-iml-1.3b |
-| opt-iml-max-1.3b |
diff --git a/llm/config/qwen/README.md b/llm/config/qwen/README.md
deleted file mode 100644
index 4684e17f2633..000000000000
--- a/llm/config/qwen/README.md
+++ /dev/null
@@ -1,54 +0,0 @@
-# Qwen
-
-## 1. Model Introduction
-
-[Tongyi Qianwen (Qwen)](https://arxiv.org/abs/2309.16609) is the Tongyi Qianwen large-model series developed by Alibaba Cloud, available in sizes including 7B and 14B. Qwen is a Transformer-based large language model trained on ultra-large-scale pretraining data of diverse types and broad coverage, including large amounts of web text, professional books, and code.
-
-**Supported model weights:**
-| Model |
-|--------------------|
-| qwen/qwen-7b |
-| qwen/qwen-7b-chat |
-| qwen/qwen-14b |
-| qwen/qwen-14b-chat |
-| qwen/qwen-72b |
-| qwen/qwen-72b-chat |
-
-[Tongyi Qianwen (Qwen1.5)](https://qwenlm.github.io/blog/qwen1.5/) is Alibaba Cloud's upgraded version of the Qwen series. Qwen1.5 comprises Base and Chat models at nine sizes: 0.5B, 1.8B, 4B, 7B, 14B, 32B, 72B, 110B, and MoE.
-
-**Supported model weights:**
-| Model (qwen-1.5) |
-|-----------------------------|
-| Qwen/Qwen1.5-0.5B |
-| Qwen/Qwen1.5-0.5B-Chat |
-| Qwen/Qwen1.5-1.8B |
-| Qwen/Qwen1.5-1.8B-Chat |
-| Qwen/Qwen1.5-4B |
-| Qwen/Qwen1.5-4B-Chat |
-| Qwen/Qwen1.5-7B |
-| Qwen/Qwen1.5-7B-Chat |
-| Qwen/Qwen1.5-14B |
-| Qwen/Qwen1.5-14B-Chat |
-| Qwen/Qwen1.5-32B |
-| Qwen/Qwen1.5-32B-Chat |
-| Qwen/Qwen1.5-72B |
-| Qwen/Qwen1.5-72B-Chat |
-| Qwen/Qwen1.5-110B |
-| Qwen/Qwen1.5-110B-Chat |
-| Qwen/Qwen1.5-MoE-A2.7B |
-| Qwen/Qwen1.5-MoE-A2.7B-Chat |
-
-[Tongyi Qianwen (Qwen2)](https://qwenlm.github.io/blog/qwen2/) is Alibaba Cloud's upgraded version of the Qwen series. Qwen2 comprises Base and Chat models at five sizes: 0.5B, 1.5B, 7B, 72B, and MoE.
-**Supported model weights:**
-| Model (qwen2) |
-|------------------------------|
-| Qwen/Qwen2-0.5B |
-| Qwen/Qwen2-0.5B-Instruct |
-| Qwen/Qwen2-1.5B |
-| Qwen/Qwen2-1.5B-Instruct |
-| Qwen/Qwen2-7B |
-| Qwen/Qwen2-7B-Instruct |
-| Qwen/Qwen2-72B |
-| Qwen/Qwen2-72B-Instruct |
-| Qwen/Qwen2-57B-A14B |
-| Qwen/Qwen2-57B-A14B-Instruct |
diff --git a/llm/config/yuan/README.md b/llm/config/yuan/README.md
deleted file mode 100644
index c7c02f9ae2fc..000000000000
--- a/llm/config/yuan/README.md
+++ /dev/null
@@ -1,83 +0,0 @@
-# Yuan 2.0
-
-## 1. Model Introduction
-
-[Yuan 2.0](https://github.com/IEIT-Yuan/Yuan-2.0) is a new generation of foundation language models released by IEIT Systems. Building on Yuan 1.0, it uses more diverse, high-quality pretraining data and instruction fine-tuning datasets, giving the model stronger understanding across semantics, mathematics, reasoning, code, knowledge, and other areas.
-
-Currently the PaddlePaddle adaptation of Yuan 2.0 supports only data parallelism and tensor parallelism; further features are under development.
-
-**Supported model weights:**
-
-| Model |
-|-------------------|
-| IEITYuan/Yuan2-2B |
-| IEITYuan/Yuan2-51B |
-| IEITYuan/Yuan2-102B |
-
-## 2. Inference
-
-### · 2B
-
-The inference script is as follows:
-
-```python
-from paddlenlp.transformers import AutoModelForCausalLM, AutoTokenizer
-model_path = "IEITYuan/Yuan2-2B"
-tokenizer = AutoTokenizer.from_pretrained(model_path)
-model = AutoModelForCausalLM.from_pretrained(model_path, dtype="bfloat16")
-model.eval()
-input_features = tokenizer("青岛推荐去哪玩?", return_tensors="pd")
-print("问题:", tokenizer.batch_decode(input_features["input_ids"]))
-outputs = model.generate(**input_features, do_sample=False, max_length=1024)
-print("回答:", tokenizer.batch_decode(outputs[0]))
-# Qingdao is a famous tourist city in China with many well-known attractions and activities. Here are some recommended places:\n1. Zhanqiao Pier: one of the symbols of Qingdao and part of the Badaguan scenic area, offering a beautiful coastline and impressive city views.\n2. Tsingtao Beer Museum: this museum sits on the summit of Mount Lao and offers beautiful sea views and impressive cityscapes.\n3. Badaguan scenic area: home to many well-known sights such as Zhanqiao Pier, Music Square, and the Tsingtao Beer Museum.\n4. Qingdao Olympic Sailing Center: the sailing regatta has been successfully held in Qingdao twice and is an important event for its residents.\n5. Qingdao Old Street: rich in history and distinctive architecture, with many street vendors to help visitors find souvenirs.\n6. Underwater World: Mount Lao is China's largest undersea rock cave, where beautiful corals and all kinds of fish can be seen.\n7. Mount Lao scenic area: listed as a World Heritage site by UNESCO, with rich natural and cultural resources.\nWhichever place you choose, you can enjoy beautiful scenery and rich cultural activities. Hope you have the chance to visit Qingdao!
-```
-
-### · 51B
-
-The inference script is as follows:
-
-```bash
-export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
-python -m paddle.distributed.launch \
-    --devices "0,1,2,3,4,5,6,7" \
-    test_tp.py
-```
-
-test_tp.py:
-
-```python
-from paddle.distributed import fleet
-from paddlenlp.transformers import AutoTokenizer, AutoModelForCausalLM
-strategy = fleet.DistributedStrategy()
-strategy.hybrid_configs = {
-    "dp_degree": 1,
-    "mp_degree": 8,
-    "pp_degree": 1,
-    "sharding_degree": 1,
-    }
-fleet.init(is_collective=True, strategy=strategy)
-hcg = fleet.get_hybrid_communicate_group()
-tensor_parallel_rank = hcg.get_model_parallel_rank()
-model_path = "IEITYuan/Yuan2-51B"
-tokenizer = AutoTokenizer.from_pretrained(model_path, add_eos_token=False, add_bos_token=False, eos_token='')
-tokenizer.add_tokens(['', '', '', '', '', '', '','','','','','','','',''], special_tokens=True)
-model = AutoModelForCausalLM.from_pretrained(model_path, tensor_parallel_degree=8, tensor_parallel_rank=tensor_parallel_rank, dtype="bfloat16")
-model.eval()
-input_features = tokenizer("厦门推荐去哪玩?", return_tensors="pd")
-print("问题:", tokenizer.batch_decode(input_features["input_ids"]))
-outputs = model.generate(**input_features, do_sample=False, max_length=1024)
-print("回答:", tokenizer.batch_decode(outputs[0]))
-```
-
-### · 102B
-
-Uses the same inference script as the 51B model.
-
-## 3. Pretraining
-
-See the [LLM end-to-end toolchain introduction](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm).
-
-## 4. Fine-tuning
-
-See the [LLM end-to-end toolchain introduction](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm).
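The support matrices changed above all share the same legend (✅ supported, 🚧 in progress, ❌ not supported) and the same Markdown row layout. As a minimal sketch of how such a row decomposes, here is a small helper that is **not** part of PaddleNLP — the function name and column list are purely illustrative, with the sample row taken from the matrix in this diff:

```python
# Hypothetical helper (not part of PaddleNLP): turn one Markdown row of a
# support matrix like the ones above into a {feature: status} mapping.
LEGEND = {"✅": "supported", "🚧": "in progress", "❌": "not supported"}


def parse_matrix_row(row, columns):
    """Split `| [Model] | ✅ | 🚧 | ... |` into a model name and feature map."""
    # Drop the outer pipes, then split on the inner ones and trim whitespace.
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    model, flags = cells[0], cells[1:]
    return {
        "model": model.strip("[]"),  # "[Qwen]" -> "Qwen"
        "features": {col: LEGEND.get(flag, flag) for col, flag in zip(columns, flags)},
    }


# Sample row and column names from the README support matrix in this diff.
columns = ["Pretrain", "SFT", "LoRA", "FlashMask", "Prefix Tuning",
           "DPO/SimPO/ORPO", "RLHF", "Quantization"]
info = parse_matrix_row("| [Qwen] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | 🚧 |", columns)
print(info["model"], info["features"]["RLHF"])  # -> Qwen in progress
```

This mirrors the table convention only; the authoritative support status is the matrices themselves.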