meSpeak – Voices & Languages

A short guide to the set-up of languages and voices for meSpeak.
Please note that meSpeak is based on an Emscripten port of eSpeak, so all of eSpeak's grammar also applies to meSpeak.

Standard Language Files

meSpeak's language-files provide eSpeak's language- and voice-files in a single package.
(Since a voice usually refers to a language and its dictionary, it seems suitable to bundle them together in a single file.)
The language-files are of the following structure (JSON):

{ "voice_id": "<filename>", "dict_id": "<filename>", "dict": "<base64-encoded octet stream>", "voice": "<base64-encoded octet stream>" }

The values of voice_id and dict_id are actually UNIX-filenames, dict_id relative to the path of eSpeak's data-directory "espeak-data/", voice_id relative to "espeak-data/voices/".

If we were to embed the files for the language "en-en", these would be:

"en/en-en" for the voice and
"en_dict" for the dictionary used by "en-en"

For a standard language-file, you would add base64-representations of the respective eSpeak-files as the string values of dict and voice.
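
The assembly of a standard language-file could be sketched as follows. This is a minimal sketch in JavaScript, assuming Node.js (for the global Buffer). The helper names sourcePaths and buildLanguageFile are made up for illustration and are not part of meSpeak, and short placeholder byte sequences stand in for the real eSpeak-files.

```javascript
// Hypothetical helpers, not part of meSpeak. Assumes Node.js (global Buffer).

// Where the two eSpeak source files live, following the path rules above.
function sourcePaths(voiceId, dictId) {
  return {
    voice: "espeak-data/voices/" + voiceId,
    dict: "espeak-data/" + dictId
  };
}

// Wrap the raw file contents into the JSON structure of a language-file.
function buildLanguageFile(voiceId, dictId, voiceBytes, dictBytes) {
  return {
    voice_id: voiceId,
    dict_id: dictId,
    dict: Buffer.from(dictBytes).toString("base64"),
    voice: Buffer.from(voiceBytes).toString("base64")
  };
}

const paths = sourcePaths("en/en-en", "en_dict");
// In practice the bytes would be read from paths.voice and paths.dict,
// e.g. with fs.readFileSync; placeholders keep this sketch self-contained.
const voiceBytes = Buffer.from("name placeholder\n", "utf8");
const dictBytes = Buffer.from([0x00, 0x01, 0x02, 0x03]);

const languageFile = buildLanguageFile("en/en-en", "en_dict", voiceBytes, dictBytes);
const languageFileJson = JSON.stringify(languageFile);
```

The resulting languageFileJson string is what would be saved as the ".json" language-file.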

Customizing

There is an alternate layout for meSpeak's language-files, which is especially useful for the purpose of customizing and testing:

{ "voice_id": "<filename>", "dict_id": "<filename>", "dict": "<base64-encoded octet stream>", "voice": "<text-string>", "voice_encoding": "text" }

Since eSpeak's voice-files are actually plain-text files, you may use a simple string for these, if you provide an additional property "voice_encoding": "text" at the same time.

For dictionaries, which are binary files in eSpeak, see the note at the end of the page.

Example

As an example, we will configure a basic female voice for "en-us", which will be named "en-us-f".

Make a copy of the meSpeak language-file (JSON) which you want to modify (in this case "voices/en/en-us.json").
Rename the file (e.g.: "en-us-f.json") and open it in an editor.
Download the source of eSpeak and go to the "espeak-data/" directory.
The eSpeak-file "espeak-data/voices/en-us" looks like this:
Rename the "name" parameter to make it unique (e.g.: "name english-us-f").
Change any parameters as you wish, in this case change "gender male" to "gender female" for a female voice.
You should have arrived at something like this (first line removed, since it is just a comment):
Replace any line-breaks by "\n" in order to get a valid JSON-string, and use this as the value of the "voice"-property of the JSON-file.
Add the line "voice_encoding": "text" to the JSON to indicate that the voice is plain-text.
Your voice file should now look like this:
Save it and load it into meSpeak.
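
The editing steps above can also be scripted. The following is a minimal sketch in JavaScript, not an official tool: withTextVoice is a hypothetical helper, the object "original" is a stand-in for the parsed copy of the language-file, and the three-line voice definition is illustrative only (it is not the actual content of eSpeak's "en-us" voice-file). Giving the new voice its own voice_id is an assumption made here, too.

```javascript
// Stand-in for the parsed copy of the language-file (values are placeholders).
const original = {
  voice_id: "en/en-us",
  dict_id: "en_dict",
  dict: "<base64-encoded octet stream>",
  voice: "<base64-encoded octet stream>"
};

// Illustrative voice definition only, NOT the real eSpeak "en-us" file.
const voiceText = [
  "name english-us-f",
  "language en-us",
  "gender female"
].join("\n");

// Hypothetical helper: swap in a plain-text voice and flag its encoding.
// Using a distinct voice_id is an assumption of this sketch.
function withTextVoice(languageFile, voiceId, text) {
  return {
    ...languageFile,
    voice_id: voiceId,
    voice: text,
    voice_encoding: "text"
  };
}

const custom = withTextVoice(original, "en/en-us-f", voiceText);

// JSON.stringify escapes each line-break in the voice text as "\n".
const customJson = JSON.stringify(custom, null, 2);
```

Since JSON.stringify takes care of escaping the line-breaks, no manual replacement is needed when the file is generated this way.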

Please note that eSpeak does not handle syntax errors in a voice-definition gracefully and will just throw an error, which, in the case of meSpeak, will show up in the console-log.

For further details on voice-parameters and fine-tuning, please refer to the eSpeak-documentation: http://espeak.sourceforge.net/voices.html.

Custom Dictionaries

eSpeak's dictionaries are binary files, which must be compiled with eSpeak first.
You would have to install eSpeak and compile a file following the eSpeak documentation.
Next, you would insert a base64-encoded string of the resulting object-file's content as the value of the dict property of a meSpeak language-file.
Finally, you would set a suitable and unique value for the property dict_id (UNIX file path).
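
The last two steps might look like this in JavaScript. This is a sketch under assumptions: it assumes Node.js (global Buffer), withCustomDict is a hypothetical helper, "xx_dict" is a placeholder name, and a few placeholder bytes stand in for a dictionary that was really compiled with eSpeak.

```javascript
// Hypothetical helper, not part of meSpeak. Assumes Node.js (global Buffer).
// Returns a copy of the language-file with a new dictionary embedded.
function withCustomDict(languageFile, dictId, compiledDictBytes) {
  return {
    ...languageFile,
    dict_id: dictId,
    dict: Buffer.from(compiledDictBytes).toString("base64")
  };
}

// Placeholder for the content of the compiled dictionary file; in practice
// this would be read from disk, e.g. with fs.readFileSync.
const compiledDict = Buffer.from([0xde, 0xad, 0xbe, 0xef]);

// Placeholder language-file; "xx_dict" is an illustrative name only.
const baseFile = {
  voice_id: "en/en-us",
  dict_id: "en_dict",
  dict: "<base64-encoded octet stream>",
  voice: "<base64-encoded octet stream>"
};

const withDict = withCustomDict(baseFile, "xx_dict", compiledDict);

// Sanity check: decoding the embedded string must give back the same bytes.
const roundTripOk = Buffer.from(withDict.dict, "base64").equals(compiledDict);
```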

There is no shortcut to this. Sorry.

Norbert Landsteiner
Vienna, July 2013