We would like to add vocabulary to our small Japanese model so that it can recognize specific names, but we have not managed to create a working custom dictionary.
First, I added the words I want to register to words.txt. (For example, for “nasas”, I added “nasas 400000”.)
Then I listed the same vocabulary, together with its phones, in text.txt. (For example, for “nasas”, I added “nasas n_B a_I s_I a_I s_I u_E”.)
In phones.txt, the phones are registered as “...,m_B 115,m_E 116,m_I 117,...”.
When I then run “farcompilestrings --fst_type=compact --symbols=words.txt --keep_symbols text.txt > text.far” in WSL, the following error is displayed:
ERROR: ConvertSymbolToLabel: Symbol “n_B” is not mapped to any integer label, symbol table = words.txt
FATAL: FarCompileStrings: Compiling string number 1 in file text.txt failed with token_type = symbol and entry_type = line
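To make the error easier to reason about: the message says that a token in text.txt ("n_B") has no entry in the symbol table passed via --symbols, which here is words.txt. The Python sketch below only imitates that lookup so the mismatch can be seen before running farcompilestrings. It is not OpenFst code; the helper names `load_symbols` and `unmapped_tokens` are hypothetical, and the "<eps> 0" entry is an assumption about what words.txt contains.

```python
def load_symbols(lines):
    """Parse 'symbol id' lines (the words.txt / phones.txt layout) into a dict."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            table[parts[0]] = int(parts[1])
    return table


def unmapped_tokens(text_lines, symbols):
    """Return (line_number, token) pairs for tokens that have no symbol entry."""
    missing = []
    for lineno, line in enumerate(text_lines, start=1):
        for token in line.split():
            if token not in symbols:
                missing.append((lineno, token))
    return missing


# Stand-ins for the real files; "<eps> 0" is assumed, "nasas 400000" is from the post.
words = load_symbols(["<eps> 0", "nasas 400000"])
report = unmapped_tokens(["nasas n_B a_I s_I a_I s_I u_E"], words)
print(report)
```

With these inputs, the first token reported is "n_B" on line 1, the same symbol named in the error. Every phone token in text.txt is reported because words.txt maps words only, not phones.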
Am I doing something wrong to begin with? Can you please tell me how best to add vocabulary to the Japanese model?
If I want to add a specific person's name that is not in the language model, do I need to rebuild the model?
Please let us know.
Thank you in advance.
(This message was translated with DeepL.)