rebuild git

Jackiexiao · Mar 3, 2018 · 387ee6b
commit 387ee6b

Showing 79 changed files with 11,378 additions and 0 deletions.
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,13 @@
+*.pyc
+data/*
+__pycache__
+src/generate.py
+forced_alignment/*
+utils/*
+.ipynb_checkpoints/
+tools/montreal-forced-aligner_linux.tar.gz
+tools/montreal-forced-aligner
+misc/thchs30.zip
+docs/_build/*
+thchs30_250_demo.tar.gz
+docs/.vscode
diff --git a/LICENSE b/LICENSE
@@ -0,0 +1,22 @@
+MIT License
+
+Copyright (c) 2017-2018 Jackiexiao <[email protected]>
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/README-zh.md b/README-zh.md
@@ -0,0 +1,87 @@
+# MTTS Mandarin/Chinese Text to Speech FrontEnd
+
+[English README](https://github.com/Jackiexiao/MTTS/blob/master/README.md)  
+The Chinese version is not guaranteed to stay in sync with the English version.
+
+**UNDER DEVELOPMENT**
+
+Mandarin/Chinese text-to-speech based on statistical parametric speech
+synthesis, using the Merlin toolkit
+
+Documentation: [MTTS Document](http://mtts.readthedocs.io/zh_CN/latest/#)  
+
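+## Data
+The model was trained on 15 hours of audio that is not open-source. You can use the thchs30 data for testing, or record your own audio.
+
+## Generated Samples
+Speech generated from labels in the training set: https://jackiexiao.github.io/MTTS/
+
+I also trained on 250 utterances from speaker A11 in thchs30 and synthesized samples from that model; see the website above.
+
+## How To Reproduce
+
+1. First, you need a corpus (audio and text; prosody annotation is optional)
+2. Then generate HTS labels with this project
+3. Train with [merlin](https://github.com/CSTR-Edinburgh/merlin). For details see [Mandarin_Voice](https://github.com/Jackiexiao/MTTS/tree/master/egs/mandarin_voice/s1)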
+
+## Context-related annotation & Question Set
+* [Context-related annotation](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin_label.md)
+* [Question Set](https://github.com/Jackiexiao/MTTS/blob/master/misc/questions-mandarin.hed)
+* [Rules for designing a Question Set](https://github.com/Jackiexiao/MTTS/blob/master/docs/mddocs/question.md)
+
+
+## Install
+Python : python3.6  
+System: linux(tested on ubuntu16.04)  
+```
+pip install jieba pypinyin
+sudo apt-get install libatlas3-base
+```
+Download the files below yourself, or run `bash tools/install_mtts.sh`  
+Download [montreal-forced-aligner](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/releases/download/v1.0.0/montreal-forced-aligner_linux.tar.gz) and extract it into the directory tools/  
+Download the acoustic model
+[thchs30.zip](https://github.com/Jackiexiao/MTTS/releases/download/v0.1/thchs30.zip) and copy it to the directory misc/  
+
+**Run Demo**
+```
+bash run_demo.sh
+```
+
+## Usage
+### 1. Generate HTS Label from wav and text
+* Usage: Enter the directory `MTTS/src` and run `python mtts.py txtfile wav_directory_path output_directory_path` (absolute or relative paths). This produces the HTS labels
+* Note: the text must contain Chinese characters only, with no Arabic numerals or English letters
+
+**txtfile example**
+```
+A_01 这是一段文本
+A_02 这是第二段文本
+```
+**wav_directory example** (sampling rate should be greater than 16 kHz)
+```
+--A_01.wav  
+--A_02.wav  
+```
+
+### 2. Generate HTS Label from wav and an alignment file (text already aligned to audio)
+See the source code for details:
+[mandarin_frontend.py](https://github.com/Jackiexiao/MTTS/blob/master/src/mandarin_frontend.py)
+
+### 3. Using the egs/mandarin_voice script
+Copy `MTTS/egs/mandarin_voice` to the corresponding directory in merlin,
+then read the README inside it
+
+### 4. Forced alignment of audio and text
+This project uses [Montreal-Forced-Aligner](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner) to do forced alignment
+1. We trained the acoustic model on the thchs30 dataset (see `misc/thchs30.zip`); the dictionary we use is [mandarin_mtts.lexicon](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin_mtts.lexicon)
+2. If you want to use MFA's (Montreal-Forced-Aligner) pre-trained Mandarin model, this is the dictionary you need: [mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon)
+
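+## Prosody Mark
+Labels can be generated without prosody marks.
+
+In the code, #0 marks a word boundary, #1 a prosodic word, #2 stress, #3 a prosodic phrase, and #4 an intonational phrase. This project assumes a word is smaller than a prosodic word, and the code adjusts for this automatically. Labels generated without prosody input are still usable, but the synthesized speech has weaker prosody.
+
+## Contributors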
+* Jackiexiao
+* willian56
+
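Review note on the `txtfile` and `wav_directory` formats described in this README: the input rules (one `<id> <text>` pair per line, no Arabic numerals or English letters in the text, one `<id>.wav` per utterance) are easy to check before running `mtts.py`. The following is a minimal sketch, not code from this commit. The helper names `parse_txt_line` and `missing_wavs` are hypothetical, and it assumes that prosody marks, if present, are written inline as `#0` to `#4` and should be ignored by the character check.

```python
import os
import re

# Assumption: prosody marks appear inline as "#0".."#4"; the README does not
# spell out their placement, so they are simply stripped before checking.
PROSODY_MARK = re.compile(r"#[0-4]")
ASCII_ALNUM = re.compile(r"[0-9A-Za-z]")


def parse_txt_line(line):
    """Split a line such as 'A_01 这是一段文本' into (utterance id, text)."""
    utt_id, _, text = line.strip().partition(" ")
    if not utt_id or not text:
        raise ValueError("expected '<id> <text>', got: %r" % line)
    # The README forbids Arabic numerals and English letters in the text.
    if ASCII_ALNUM.search(PROSODY_MARK.sub("", text)):
        raise ValueError("text contains digits or English letters: %r" % text)
    return utt_id, text


def missing_wavs(utt_ids, wav_dir):
    """Return the ids that have no matching <id>.wav file in wav_dir."""
    return [u for u in utt_ids
            if not os.path.isfile(os.path.join(wav_dir, u + ".wav"))]
```

Running `parse_txt_line` over every line of the text file and then passing the collected ids to `missing_wavs` reports malformed lines and missing recordings up front, before forced alignment starts.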
diff --git a/README.md b/README.md
@@ -0,0 +1,94 @@
+# MTTS Mandarin/Chinese Text to Speech FrontEnd
+
+[中文README](https://github.com/Jackiexiao/MTTS/blob/master/README-zh.md)
+
+**UNDER DEVELOPMENT**
+
+Mandarin/Chinese text-to-speech based on statistical parametric speech
+synthesis, using the Merlin toolkit
+
+Read the documentation (written in Chinese) at [MTTS Document](http://mtts.readthedocs.io/zh_CN/latest/#)
+
+## Data
+The model is trained on 15 hours of Mandarin speech that is not
+open-source, but you can use the thchs30 dataset to run the demo (or record
+your own wav files)
+
+## Generated Samples
+Samples generated from training-set labels: https://jackiexiao.github.io/MTTS/
+
+I also trained on the thchs30 dataset (using only 250 wavs from speaker A11); see
+the website above
+
+## How To Reproduce
+1. First, you need data containing wav and txt files (prosody marks are optional)
+2. Second, generate HTS labels using this project
+3. Train with [merlin](https://github.com/CSTR-Edinburgh/merlin). For the specific method, see [Mandarin_Voice](https://github.com/Jackiexiao/MTTS/tree/master/egs/mandarin_voice/s1)
+
+## Context-related annotation & Question Set
+* [Context-related annotation](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin_label.md)
+* [Question Set](https://github.com/Jackiexiao/MTTS/blob/master/misc/questions-mandarin.hed)
+* [Rules for designing a Question Set](https://github.com/Jackiexiao/MTTS/blob/master/docs/mddocs/question.md)
+
+## Install
+Python : python3.6  
+System: linux(tested on ubuntu16.04)  
+```
+pip install jieba pypinyin
+sudo apt-get install libatlas3-base
+```
+Download the files yourself or run `bash tools/install_mtts.sh`  
+Download [montreal-forced-aligner](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/releases/download/v1.0.0/montreal-forced-aligner_linux.tar.gz) and extract it into the directory tools/  
+Download the acoustic model
+[thchs30.zip](https://github.com/Jackiexiao/MTTS/releases/download/v0.1/thchs30.zip) and copy it to the directory misc/  
+
+**Run Demo**
+```
+bash run_demo.sh
+```
+## Usage
+### 1. Generate HTS Label from wav and text
+* Usage: Enter the directory `MTTS/src` and run `python mtts.py txtfile wav_directory_path output_directory_path` (absolute or relative paths). This produces the HTS labels
+* Attention: Currently only Chinese characters are supported; the txt file should not contain any
+    Arabic numerals or English letters
+
+**txtfile example**
+```
+A_01 这是一段文本
+A_02 这是第二段文本
+```
+**wav_directory example** (sampling rate should be greater than 16 kHz)
+```
+--A_01.wav  
+--A_02.wav  
+```
+
+### 2. Generate Label from wav and alignment file
+See the source code for more information:
+[mandarin_frontend.py](https://github.com/Jackiexiao/MTTS/blob/master/src/mandarin_frontend.py)
+
+### 3. Using the egs/mandarin_voice script
+Copy `MTTS/egs/mandarin_voice` to the corresponding directory in merlin, and see the README [Mandarin_Voice](https://github.com/Jackiexiao/MTTS/blob/master/egs/mandarin_voice/s1/README.md)
+
+### 4. Forced-alignment
+This project uses [Montreal-Forced-Aligner](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner) to do forced alignment
+1. We trained the acoustic model on the thchs30 dataset (see `misc/thchs30.zip`); the dictionary we use is [mandarin_mtts.lexicon](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin_mtts.lexicon)
+2. If you want to use MFA's (Montreal-Forced-Aligner) pre-trained Mandarin model, this is the dictionary you need: [mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon](https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon)
+
+## Prosody Mark
+You can generate HTS labels without prosody marks. We assume that a word segment is
+smaller than a prosodic word (the code adjusts for this)
+
+"#0", "#1", "#2", "#3" and "#4" are the prosody labeling symbols.
+* #0 stands for word segment
+* #1 stands for prosodic word
+* #2 stands for stressed word (in this project it is actually treated as #1)
+* #3 stands for prosodic phrase
+* #4 stands for intonational phrase
+
+Improvements to prosody analysis will come soon
+
+## Contributors
+* Jackiexiao
+* willian56
+
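Review note on the "Prosody Mark" section above: the five symbols and the "#2 is treated as #1" rule can be illustrated with a small parser. This is a sketch under stated assumptions, not the front end's actual implementation. It assumes each mark is written inline, immediately after the unit it closes (for example `这是#1一段#2文本#4`), which the README does not specify, and the function name `split_prosody` is hypothetical.

```python
import re

# Assumption: a mark "#0".."#4" directly follows the unit it closes.
MARK = re.compile(r"#([0-4])")


def split_prosody(text):
    """Return (segment, level) pairs for a prosody-marked string.

    Level 2 (stressed word) is folded into level 1, as the README states.
    Trailing text with no closing mark gets level None.
    """
    pairs = []
    pos = 0
    for match in MARK.finditer(text):
        segment = text[pos:match.start()]
        level = int(match.group(1))
        if level == 2:
            level = 1
        if segment:
            pairs.append((segment, level))
        pos = match.end()
    tail = text[pos:]
    if tail:
        pairs.append((tail, None))
    return pairs
```

For the example string above this yields three segments with levels 1, 1 and 4; unmarked text comes back as a single segment with level `None`, matching the README's statement that labels can be generated without prosody marks.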
diff --git a/docs/Makefile b/docs/Makefile
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line.
+SPHINXOPTS    =
+SPHINXBUILD   = sphinx-build
+SPHINXPROJ    = MTTS
+SOURCEDIR     = .
+BUILDDIR      = _build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)