This repository has been archived by the owner on Apr 18, 2022. It is now read-only.

rebuild git
Jackiexiao committed Mar 3, 2018
0 parents commit 387ee6b
Showing 79 changed files with 11,378 additions and 0 deletions.
13 changes: 13 additions & 0 deletions .gitignore
@@ -0,0 +1,13 @@
*.pyc
data/*
__pycache__
src/generate.py
forced_alignment/*
utils/*
.ipynb_checkpoints/
tools/montreal-forced-aligner_linux.tar.gz
tools/montreal-forced-aligner
misc/thchs30.zip
docs/_build/*
thchs30_250_demo.tar.gz
docs/.vscode
22 changes: 22 additions & 0 deletions LICENSE
@@ -0,0 +1,22 @@
MIT License

Copyright (c) 2017-2018 Jackiexiao <[email protected]>

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
87 changes: 87 additions & 0 deletions README-zh.md
@@ -0,0 +1,87 @@
# MTTS Mandarin/Chinese Text to Speech FrontEnd

[English README](https://github.com/Jackiexiao/MTTS/blob/master/README.md)
The Chinese version is not guaranteed to be in sync with the English version.

**ON_DEVELOPMENT**

Mandarin/Chinese Text to Speech based on statistical parametric speech
synthesis using merlin toolkit

Documentation: [MTTS Document](http://mtts.readthedocs.io/zh_CN/latest/#)

## Data
15 hours of audio were used, but that audio is not open source. You can test with the thchs30 data, or record your own audio.

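## Generated Samples
Speech generated from labels in the training set: https://jackiexiao.github.io/MTTS/

I also trained on 250 utterances from speaker A11 in thchs30 and synthesized audio samples; see the website above.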

## How To Reproduce

1. First, you need a corpus (audio and text; prosody annotation is optional)
2. Then generate HTS labels with this project
3. Train with [merlin](https://github.com/CSTR-Edinburgh/merlin). For details, see [Mandarin_Voice](https://github.com/Jackiexiao/MTTS/tree/master/egs/mandarin_voice/s1)

## Context-Dependent Annotation and Question Set
* [Context related annotation上下文相关标注](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin_label.md)
* [Question Set问题集](https://github.com/Jackiexiao/MTTS/blob/master/misc/questions-mandarin.hed)
* [Rules to design a Question Set问题集设计规则](https://github.com/Jackiexiao/MTTS/blob/master/docs/mddocs/question.md)


## Install
Python: 3.6
System: Linux (tested on Ubuntu 16.04)
```
pip install jieba pypinyin
sudo apt-get install libatlas3-base
```
Download the files below yourself, or run `bash tools/install_mtts.sh`
Download [montreal-forced-aligner](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/releases/download/v1.0.0/montreal-forced-aligner_linux.tar.gz) and extract it into the `tools/` directory
Download the acoustic model
[thchs30.zip](https://github.com/Jackiexiao/MTTS/releases/download/v0.1/thchs30.zip) and copy it into the `misc/` directory

**Run Demo**
```
bash run_demo.sh
```

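## Usage
### 1. Generate HTS labels from audio and text
* Usage: change into the `MTTS/src` directory and run `python mtts.py txtfile wav_directory_path output_directory_path` (paths may be absolute or relative). This produces the HTS labels.
* Note: the text may contain only Chinese; it must not contain Arabic numerals or English letters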

**txtfile example**
```
A_01 这是一段文本
A_02 这是第二段文本
```
**wav_directory example** (the sampling rate should be greater than 16 kHz)
```
--A_01.wav
--A_02.wav
```

### 2. Generate HTS labels from audio and an annotation file that already aligns text and audio
See the source code for details:
[mandarin_frontend.py](https://github.com/Jackiexiao/MTTS/blob/master/src/mandarin_frontend.py)

### 3. Using the egs/mandarin_voice scripts
Copy `MTTS/egs/mandarin_voice` to the corresponding directory in merlin,
then read the README inside it

### 4. Forced alignment of audio and text
This project uses [Montreal-Forced-Aligner](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner) for forced alignment.
1. We trained the acoustic model on the thchs30 dataset (see `misc/thchs30.zip`); the dictionary we use is [mandarin_mtts.lexicon](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin_mtts.lexicon)
2. If you want to use MFA's (Montreal-Forced-Aligner) pre-trained Mandarin model, you need this dictionary instead: [mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon)

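## Prosody Marks
Labels can be generated without prosody marks.

In the code, #0 marks a word boundary, #1 a prosodic word, #2 stress, #3 a prosodic phrase, and #4 an intonational phrase. This project requires a word to be smaller than a prosodic word, and the code adjusts for this automatically. Usable labels can still be generated when no prosody is given, but the synthesized speech will have weaker prosody.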

## Contributors
* Jackiexiao
* willian56

94 changes: 94 additions & 0 deletions README.md
@@ -0,0 +1,94 @@
# MTTS Mandarin/Chinese Text to Speech FrontEnd

[中文README](https://github.com/Jackiexiao/MTTS/blob/master/README-zh.md)

**ON_DEVELOPMENT**

Mandarin/Chinese Text to Speech based on statistical parametric speech
synthesis using merlin toolkit

Read the documentation (written in Chinese) at [MTTS Document](http://mtts.readthedocs.io/zh_CN/latest/#)

## Data
The voice was trained on 15 hours of wav recordings from a Mandarin speech
synthesis dataset that is not open source. You can use the thchs30 dataset to
run the demo, or record your own wav files.

## Generated Samples
Wavs generated from training-set labels: https://jackiexiao.github.io/MTTS/

I also trained on the thchs30 dataset (using only 250 wavs from speaker A11);
see the website above

## How To Reproduce
1. First, you need data that contains wav and txt files (prosody marks are optional)
2. Second, generate HTS labels using this project
3. Third, train with [merlin](https://github.com/CSTR-Edinburgh/merlin). For the specific method, see [Mandarin_Voice](https://github.com/Jackiexiao/MTTS/tree/master/egs/mandarin_voice/s1)

## Context related annotation & Question Set
* [Context related annotation上下文相关标注](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin_label.md)
* [Question Set问题集](https://github.com/Jackiexiao/MTTS/blob/master/misc/questions-mandarin.hed)
* [Rules to design a Question Set问题集设计规则](https://github.com/Jackiexiao/MTTS/blob/master/docs/mddocs/question.md)

## Install
Python: 3.6
System: Linux (tested on Ubuntu 16.04)
```
pip install jieba pypinyin
sudo apt-get install libatlas3-base
```
Download the files below yourself, or run `bash tools/install_mtts.sh`
Download [montreal-forced-aligner](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/releases/download/v1.0.0/montreal-forced-aligner_linux.tar.gz) and extract it into the `tools/` directory
Download the acoustic model
[thchs30.zip](https://github.com/Jackiexiao/MTTS/releases/download/v0.1/thchs30.zip) and copy it into the `misc/` directory

**Run Demo**
```
bash run_demo.sh
```
## Usage
### 1. Generate HTS Label by wav and text
* Usage: change into the `MTTS/src` directory and run `python mtts.py txtfile wav_directory_path output_directory_path` (paths may be absolute or relative). This produces the HTS labels.
* Attention: only Chinese characters are currently supported; the txt file
should not contain any Arabic numerals or English letters

**txtfile example**
```
A_01 这是一段文本
A_02 这是第二段文本
```
**wav_directory example** (the sampling rate should be greater than 16 kHz)
```
--A_01.wav
--A_02.wav
```
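
Before running `mtts.py`, it can help to check the inputs against the two rules above. The following is a minimal sketch, not part of MTTS: the helper names `parse_txtfile_line` and `check_corpus` are hypothetical, and it assumes that each txtfile line is an utterance ID, whitespace, then the text, and that each utterance has a matching `<ID>.wav` in the wav directory. It does not check the sampling rate.

```python
import re
from pathlib import Path

# Arabic numerals and English letters, which the README says the text must not contain.
_FORBIDDEN = re.compile(r"[0-9A-Za-z]")


def parse_txtfile_line(line):
    """Split 'A_01 这是一段文本' into ('A_01', '这是一段文本')."""
    parts = line.strip().split(None, 1)
    utt_id = parts[0]
    text = parts[1].strip() if len(parts) > 1 else ""
    return utt_id, text


def check_corpus(txt_lines, wav_dir):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for line in txt_lines:
        if not line.strip():
            continue  # skip blank lines
        utt_id, text = parse_txtfile_line(line)
        if not text:
            problems.append(f"{utt_id}: no text")
        elif _FORBIDDEN.search(text):
            problems.append(f"{utt_id}: text contains digits or English letters")
        # Assumption: the wav file is named after the utterance ID.
        if not (Path(wav_dir) / f"{utt_id}.wav").is_file():
            problems.append(f"{utt_id}: missing {utt_id}.wav")
    return problems
```

For example, `check_corpus(open("txtfile", encoding="utf-8"), "wav_directory_path")` would list every utterance whose text or wav file looks wrong, using the same placeholder paths as the usage line above.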

### 2. Generate Label by wav and alignment file
See the source code for more information:
[mandarin_frontend.py](https://github.com/Jackiexiao/MTTS/blob/master/src/mandarin_frontend.py)

### 3. Using egs/mandarin_voice script
Copy `MTTS/egs/mandarin_voice` to the corresponding directory in merlin, then see the README: [Mandarin_Voice](https://github.com/Jackiexiao/MTTS/blob/master/egs/mandarin_voice/s1/README.md)

### 4. Forced-alignment
This project uses [Montreal-Forced-Aligner](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner) for forced alignment.
1. We trained the acoustic model on the thchs30 dataset (see `misc/thchs30.zip`); the dictionary we use is [mandarin_mtts.lexicon](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin_mtts.lexicon)
2. If you want to use MFA's (Montreal-Forced-Aligner) pre-trained Mandarin model, you need this dictionary instead: [mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon)

## Prosody Mark
You can generate HTS labels without prosody marks. We assume that a word
segment is smaller than a prosodic word (the code adjusts for this).

"#0", "#1", "#2", "#3" and "#4" are the prosody labeling symbols.
* #0 stands for word segment
* #1 stands for prosodic word
* #2 stands for stressed word (in this project it is actually regarded as #1)
* #3 stands for prosodic phrase
* #4 stands for intonational phrase
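
To make the symbols concrete, here is a minimal sketch of how text carrying these marks could be split into segments and boundary levels. It is not the MTTS implementation: the function name `split_prosody` is hypothetical, and it assumes the marks are written inline, directly after the segment they close, which this README does not specify. Following the note on #2, it maps level 2 to level 1.

```python
import re

# One prosody mark: '#' followed by a level from 0 to 4.
_MARK = re.compile(r"#([0-4])")


def split_prosody(text):
    """Return (segment, level) pairs, where level is the boundary after the segment.

    Assumption: each mark directly follows the segment it closes.
    Trailing text without a mark gets level None.
    """
    pairs = []
    pos = 0
    for match in _MARK.finditer(text):
        segment = text[pos:match.start()]
        level = int(match.group(1))
        if level == 2:
            level = 1  # the README treats #2 as #1 in this project
        if segment:
            pairs.append((segment, level))
        pos = match.end()
    tail = text[pos:]
    if tail:
        pairs.append((tail, None))
    return pairs
```

Under these assumptions, `split_prosody("这是#1一段#2文本#4")` yields three segments with levels 1, 1 and 4, and unmarked text comes back as a single segment with level `None`.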

Improvements to the prosody analysis will come soon

## Contributors
* Jackiexiao
* willian56

20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
SPHINXPROJ = MTTS
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)