Replies: 19 comments
-
Are you using p-tuning, or LoRA?
-
I have the same problem. Does shortening the text length help?
-
I ran into the same problem with alpaca-lora. My guess is that long-text generation really is harder, and LoRA is also one of the weaker fine-tuning methods in terms of quality.
-
I'm seeing this with p-tuning as well.
-
Don't set max_target_length too long; shorten it to 64 and increase the number of training steps accordingly. At deployment time, use the tokenizer from the original THUDM/ChatGLM-6B and load the p-tuned model. I ran into your situation when training LLaMA; I haven't seen it with ChatGLM.
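For the deployment part of the advice above, here is a minimal sketch along the lines of the ChatGLM-6B p-tuning README: tokenizer from the original repo, prefix-encoder weights from the p-tuning checkpoint. The checkpoint path and pre_seq_len=128 are assumptions; use the values from your own run.

```python
import os
import torch
from transformers import AutoConfig, AutoModel, AutoTokenizer

# Tokenizer comes from the original THUDM/chatglm-6b repo, not the fine-tuned output dir
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)

# pre_seq_len must match the value used during p-tuning v2 training (128 here is an assumption)
config = AutoConfig.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True, pre_seq_len=128)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", config=config, trust_remote_code=True)

# Placeholder for your own p-tuning output directory
ptuning_checkpoint_dir = "output/checkpoint-3000"
prefix_state_dict = torch.load(os.path.join(ptuning_checkpoint_dir, "pytorch_model.bin"))
prefix_weights = {
    k[len("transformer.prefix_encoder."):]: v
    for k, v in prefix_state_dict.items()
    if k.startswith("transformer.prefix_encoder.")
}
model.transformer.prefix_encoder.load_state_dict(prefix_weights)

model = model.half().cuda()
model.transformer.prefix_encoder.float()  # keep the prefix encoder in fp32, as in the official example
model = model.eval()

response, history = model.chat(tokenizer, "写一段200字左右的产品介绍", history=[])
print(response)
```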
-
With max_target_length = 128 the generated Chinese text is roughly 200 characters. You have max_target_length = 300; my guess is that if the base model can't actually produce content that long, it just keeps repeating itself.
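To pick max_target_length from the data rather than guessing, a small hedged sketch that measures how long the training targets actually are in ChatGLM tokens; the file name train.json and the "summary" field are assumptions about your data format.

```python
import json
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)

lengths = []
with open("train.json", encoding="utf-8") as f:   # hypothetical JSON-lines training file
    for line in f:
        example = json.loads(line)
        lengths.append(len(tokenizer.encode(example["summary"])))  # "summary" = target text field

lengths.sort()
n = len(lengths)
print("p50:", lengths[n // 2], "p95:", lengths[int(n * 0.95)], "max:", lengths[-1])
```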
-
Ran into this too; sometimes it churns out endless parallel sentences.
-
With full-parameter fine-tuning I don't seem to have this problem anymore.
-
@liuanping Did you manage to solve it? I'm hitting the same problem.
-
Full fine-tuning seems to fix it.
-
@Lufffya Full fine-tuning seems to fix it. Another suggestion I've seen is to add a repetition penalty.
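On the repetition-penalty idea, a minimal decoding-side sketch using the standard transformers generation arguments (repetition_penalty, no_repeat_ngram_size). The concrete values and the prompt are illustrative only, and whether this fully resolves the degeneration still depends on the fine-tuned weights.

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda().eval()

inputs = tokenizer("写一篇300字左右的长文本介绍", return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    top_p=0.7,
    temperature=0.95,
    repetition_penalty=1.2,   # >1.0 down-weights tokens that were already generated
    no_repeat_ngram_size=4,   # forbid exact 4-gram repeats
)
# Strip the prompt tokens before decoding the generated continuation
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```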
-
Oh, I see. Then it probably won't work for me; I only have a single 4090, which can't handle full fine-tuning. Thanks anyway.
-
@Lufffya chatuan might be friendlier, since it's a 1B-parameter model and the results are still decent.
-
Have you done full fine-tuning?
-
@liuanping Did you manage to solve it? I'm hitting the same problem too.
-
@shuanglong520 When I did full fine-tuning it went away; LoRA may just not be up to it.
-
Is there an existing issue for this?
Current Behavior
I find that good long-text generation is hard to tune for, and repetition is especially common, e.g. "我吃饭了吗吗吗吗吗吗吗吗吗" ("Have I eaten eaten eaten…"). Online this is described as a degeneration problem: as the generated text gets longer, its quality gradually drops and repetition appears at multiple levels (character, phrase, sentence). Does anyone have practical experience that helps here?
Expected Behavior
Looking for pointers from anyone with experience.
Steps To Reproduce
Fine-tune for long-text generation.
Environment
Anything else?
none