This repository has been archived by the owner on Apr 18, 2022. It is now read-only.
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
0 parents
commit 387ee6b
Showing 79 changed files with 11,378 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
*.pyc | ||
data/* | ||
__pycache__ | ||
src/generate.py | ||
forced_alignment/* | ||
utils/* | ||
.ipynb_checkpoints/ | ||
tools/montreal-forced-aligner_linux.tar.gz | ||
tools/montreal-forced-aligner | ||
misc/thchs30.zip | ||
docs/_build/* | ||
thchs30_250_demo.tar.gz | ||
docs/.vscode |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
MIT License | ||
|
||
Copyright (c) 2017-2018 Jackiexiao <[email protected]> | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining | ||
a copy of this software and associated documentation files (the | ||
"Software"), to deal in the Software without restriction, including | ||
without limitation the rights to use, copy, modify, merge, publish, | ||
distribute, sublicense, and/or sell copies of the Software, and to | ||
permit persons to whom the Software is furnished to do so, subject to | ||
the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be | ||
included in all copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, | ||
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF | ||
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND | ||
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE | ||
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION | ||
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION | ||
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
# MTTS Mandarin/Chinese Text to Speech FrontEnd | ||
|
||
[English README](https://github.com/Jackiexiao/MTTS/blob/master/README.md) | ||
The Chinese version is not guaranteed to stay in sync with the English version | ||
|
||
**ON_DEVELOPMENT** | ||
|
||
Mandarin/Chinese Text to Speech based on statistical parametric speech | ||
synthesis using merlin toolkit | ||
|
||
Documentation: [MTTS Document](http://mtts.readthedocs.io/zh_CN/latest/#) | ||
|
||
## Data | ||
Trained on 15 hours of audio that is not open source. You can use the thchs30 data for testing, or record your own audio | ||
|
||
## Generated Samples | ||
Speech generated from labels in the training set: https://jackiexiao.github.io/MTTS/ | ||
|
||
I also trained on 250 utterances from speaker A11 in thchs30 and synthesized audio samples; see the website above | ||
|
||
## How To Reproduce | ||
|
||
1. First, you need a corpus (audio and text; prosody annotation is optional) | ||
2. Then generate HTS labels with this project | ||
3. Train with [merlin](https://github.com/CSTR-Edinburgh/merlin). See [Mandarin_Voice](https://github.com/Jackiexiao/MTTS/tree/master/egs/mandarin_voice/s1) for details | ||
|
||
## Context-Dependent Labels and Question Set | ||
* [Context-dependent annotation](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin_label.md) | ||
* [Question set](https://github.com/Jackiexiao/MTTS/blob/master/misc/questions-mandarin.hed) | ||
* [Rules for designing a question set](https://github.com/Jackiexiao/MTTS/blob/master/docs/mddocs/question.md) | ||
|
||
|
||
## Install | ||
Python: 3.6 | ||
System: Linux (tested on Ubuntu 16.04) | ||
``` | ||
pip install jieba pypinyin | ||
sudo apt-get install libatlas3-base | ||
``` | ||
Download the files below yourself, or run `bash tools/install_mtts.sh` | ||
Download [montreal-forced-aligner](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/releases/download/v1.0.0/montreal-forced-aligner_linux.tar.gz) and extract it into the directory tools/ | ||
Download the acoustic model | ||
[thchs30.zip](https://github.com/Jackiexiao/MTTS/releases/download/v0.1/thchs30.zip) and copy it to the directory misc/ | ||
|
||
**Run Demo** | ||
``` | ||
bash run_demo.sh | ||
``` | ||
|
||
## Usage | ||
### 1. Generate HTS labels from audio and text | ||
* Usage: enter the directory `MTTS/src` and run `python mtts.py txtfile wav_directory_path output_directory_path` (absolute or relative paths). This produces the HTS labels | ||
* Note: the text may contain only Chinese; Arabic numerals and English letters are not allowed | ||
|
||
**txtfile example** | ||
``` | ||
A_01 这是一段文本 | ||
A_02 这是第二段文本 | ||
``` | ||
**wav_directory example** (the sampling rate should be higher than 16 kHz) | ||
``` | ||
--A_01.wav | ||
--A_02.wav | ||
``` | ||
|
||
### 2. Generate HTS labels from audio and an annotation file that aligns the text with the audio | ||
See the source code for details: | ||
[mandarin_frontend.py](https://github.com/Jackiexiao/MTTS/blob/master/src/mandarin_frontend.py) | ||
|
||
### 3. Using the egs/mandarin_voice scripts | ||
Copy `MTTS/egs/mandarin_voice` to the corresponding directory in merlin, | ||
then read the README inside it | ||
|
||
### 4. Forced alignment | ||
This project uses [Montreal-Forced-Aligner](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner) to do forced alignment | ||
1. We trained the acoustic model on the thchs30 dataset (see `misc/thchs30.zip`); the dictionary we use is [mandarin_mtts.lexicon](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin_mtts.lexicon) | ||
2. If you want to use MFA's (Montreal-Forced-Aligner) pre-trained Mandarin model, this is the dictionary you need: [mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon) | ||
|
||
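## Prosody Annotation | ||
Labels can be generated without prosody annotation | ||
|
||
In the code, #0 marks a word boundary, #1 a prosodic word, #2 stress, #3 a prosodic phrase, and #4 an intonational phrase. This project requires words to be smaller than prosodic words, and the code adjusts for this automatically. Usable labels can also be generated without prosody input, although the synthesized speech then has weaker prosody | ||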
|
||
## Contributors | ||
* Jackiexiao | ||
* willian56 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,94 @@ | ||
# MTTS Mandarin/Chinese Text to Speech FrontEnd | ||
|
||
[中文README](https://github.com/Jackiexiao/MTTS/blob/master/README-zh.md) | ||
|
||
**ON_DEVELOPMENT** | ||
|
||
Mandarin/Chinese Text to Speech based on statistical parametric speech | ||
synthesis using merlin toolkit | ||
|
||
Read the documentation (written in Chinese) at [MTTS Document](http://mtts.readthedocs.io/zh_CN/latest/#) | ||
|
||
## Data | ||
The model was trained on a 15-hour Mandarin speech synthesis dataset that is not | ||
open source, but you can use the thchs30 dataset to run the demo (or record your | ||
own wav files) | ||
|
||
## Generated Samples | ||
Wav files generated from training-set labels: https://jackiexiao.github.io/MTTS/ | ||
|
||
I also trained on the thchs30 dataset (using only 250 wavs from speaker A11); see | ||
the website above | ||
|
||
## How To Reproduce | ||
1. First, you need data containing wav and txt files (prosody marks are optional) | ||
2. Second, generate HTS labels using this project | ||
3. Third, train with [merlin](https://github.com/CSTR-Edinburgh/merlin). For the specific method, see [Mandarin_Voice](https://github.com/Jackiexiao/MTTS/tree/master/egs/mandarin_voice/s1) | ||
|
||
## Context-Dependent Annotation & Question Set | ||
* [Context-dependent annotation](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin_label.md) | ||
* [Question set](https://github.com/Jackiexiao/MTTS/blob/master/misc/questions-mandarin.hed) | ||
* [Rules for designing a question set](https://github.com/Jackiexiao/MTTS/blob/master/docs/mddocs/question.md) | ||
|
||
## Install | ||
Python: 3.6 | ||
System: Linux (tested on Ubuntu 16.04) | ||
``` | ||
pip install jieba pypinyin | ||
sudo apt-get install libatlas3-base | ||
``` | ||
Download the files below yourself, or run `bash tools/install_mtts.sh` | ||
Download [montreal-forced-aligner](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/releases/download/v1.0.0/montreal-forced-aligner_linux.tar.gz) and extract it into the directory tools/ | ||
Download the acoustic model | ||
[thchs30.zip](https://github.com/Jackiexiao/MTTS/releases/download/v0.1/thchs30.zip) and copy it to the directory misc/ | ||
|
||
**Run Demo** | ||
``` | ||
bash run_demo.sh | ||
``` | ||
## Usage | ||
### 1. Generate HTS labels from wav and text | ||
* Usage: enter the directory `MTTS/src` and run `python mtts.py txtfile wav_directory_path output_directory_path` (absolute or relative paths). This produces the HTS labels | ||
* Attention: currently only Chinese characters are supported; the txt file should not contain any | ||
Arabic numerals or English letters | ||
|
||
**txtfile example** | ||
``` | ||
A_01 这是一段文本 | ||
A_02 这是第二段文本 | ||
``` | ||
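The txtfile layout and the Chinese-only restriction can be checked before running `mtts.py`. The following is a minimal sketch, not part of MTTS: the helper names `parse_txtfile_line` and `find_invalid_lines` are hypothetical, and it assumes each line is an utterance ID, one space, then the text, as in the example above. It only flags ASCII digits and letters, which is the restriction stated in this README; the real front end may reject other characters as well.

```python
import re

# Flags the characters this README says are unsupported:
# Arabic numerals and English letters.
UNSUPPORTED = re.compile(r"[0-9A-Za-z]")


def parse_txtfile_line(line):
    """Split 'A_01 这是一段文本' into (utterance_id, text)."""
    utt_id, _, text = line.strip().partition(" ")
    return utt_id, text.strip()


def find_invalid_lines(lines):
    """Return (utterance_id, text) pairs whose text has digits or letters."""
    invalid = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        utt_id, text = parse_txtfile_line(line)
        if UNSUPPORTED.search(text):
            invalid.append((utt_id, text))
    return invalid


sample = ["A_01 这是一段文本", "A_02 这是第二段文本", "A_03 今天是2018年"]
print(find_invalid_lines(sample))  # [('A_03', '今天是2018年')]
```

Lines reported here would need their numerals or letters written out in Chinese characters before label generation.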
**wav_directory example** (the sampling rate should be higher than 16 kHz) | ||
``` | ||
--A_01.wav | ||
--A_02.wav | ||
``` | ||
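A quick way to confirm that every utterance ID has a matching wav file with a suitable sampling rate is to read the wav headers with Python's standard `wave` module. This is only a sketch under assumptions: the function names are hypothetical, files are assumed to be named `<utterance_id>.wav` as in the example above, and a rate of exactly 16000 Hz is treated as acceptable, which this README does not state explicitly.

```python
import os
import wave

MIN_SAMPLE_RATE = 16000  # Hz; assumption: exactly 16 kHz is accepted


def wav_sample_rate(path):
    """Read the sampling rate from a PCM wav file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def check_wav_directory(utt_ids, wav_dir, min_rate=MIN_SAMPLE_RATE):
    """Return {utterance_id: problem} for missing or low-rate wav files."""
    problems = {}
    for utt_id in utt_ids:
        path = os.path.join(wav_dir, utt_id + ".wav")
        if not os.path.isfile(path):
            problems[utt_id] = "missing"
        elif wav_sample_rate(path) < min_rate:
            problems[utt_id] = "sample rate below %d Hz" % min_rate
    return problems
```

For example, `check_wav_directory(["A_01", "A_02"], "wav_directory_path")` returns an empty dictionary when both files exist and meet the rate requirement.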
|
||
### 2. Generate labels from wav and an alignment file | ||
See the source code for more information: | ||
[mandarin_frontend.py](https://github.com/Jackiexiao/MTTS/blob/master/src/mandarin_frontend.py) | ||
|
||
### 3. Using the egs/mandarin_voice scripts | ||
Copy `MTTS/egs/mandarin_voice` to the corresponding directory in merlin, and see the README [Mandarin_Voice](https://github.com/Jackiexiao/MTTS/blob/master/egs/mandarin_voice/s1/README.md) | ||
|
||
### 4. Forced alignment | ||
This project uses [Montreal-Forced-Aligner](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner) to do forced alignment | ||
1. We trained the acoustic model on the thchs30 dataset (see `misc/thchs30.zip`); the dictionary we use is [mandarin_mtts.lexicon](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin_mtts.lexicon) | ||
2. If you want to use MFA's (Montreal-Forced-Aligner) pre-trained Mandarin model, this is the dictionary you need: [mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon) | ||
|
||
## Prosody Marks | ||
You can generate HTS labels without prosody marks. We assume that a word segment is | ||
smaller than a prosodic word (the code adjusts for this automatically) | ||
|
||
"#0","#1", "#2","#3" and "#4" are the prosody labeling symbols. | ||
* #0 stands for a word segment boundary | ||
* #1 stands for a prosodic word | ||
* #2 stands for a stressed word (in this project it is treated as #1) | ||
* #3 stands for a prosodic phrase | ||
* #4 stands for an intonational phrase | ||
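To illustrate how these symbols might be read, here is a small sketch that splits a prosody-marked string into segments and boundary levels. It is not the MTTS parser: the function name `split_prosody` is hypothetical, and the inline format (each mark written directly after the segment it closes, as in `这是#1一段#2文本#4`) is an assumption, since this README does not show an annotated example. The only behavior taken from the text above is that #2 is treated as #1.

```python
import re

# Assumed inline format: a mark '#0'..'#4' follows the segment it closes.
MARK = re.compile(r"#([0-4])")


def split_prosody(text):
    """Split '这是#1一段#2文本#4' into [(segment, level), ...]."""
    units = []
    pos = 0
    for match in MARK.finditer(text):
        segment = text[pos:match.start()]
        level = int(match.group(1))
        if level == 2:
            level = 1  # per this README, #2 is regarded as #1
        if segment:
            units.append((segment, level))
        pos = match.end()
    tail = text[pos:]
    if tail:
        # Assumption: trailing text with no mark gets a plain word boundary.
        units.append((tail, 0))
    return units


print(split_prosody("这是#1一段#2文本#4"))
# [('这是', 1), ('一段', 1), ('文本', 4)]
```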
|
||
Improvements to the prosody analysis are coming soon | ||
|
||
## Contributors | ||
* Jackiexiao | ||
* willian56 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# Minimal makefile for Sphinx documentation | ||
# | ||
|
||
# You can set these variables from the command line. | ||
SPHINXOPTS = | ||
SPHINXBUILD = sphinx-build | ||
SPHINXPROJ = MTTS | ||
SOURCEDIR = . | ||
BUILDDIR = _build | ||
|
||
# Put it first so that "make" without argument is like "make help". | ||
help: | ||
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) | ||
|
||
.PHONY: help Makefile | ||
|
||
# Catch-all target: route all unknown targets to Sphinx using the new | ||
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). | ||
%: Makefile | ||
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) |