open_jtalk_elixir

Use Open JTalk from Elixir. This package builds a local open_jtalk CLI and, by default, bundles a UTF-8 dictionary and an HTS voice (you can disable this), exposing convenient functions:

OpenJTalk.say/2 — synthesize and play via a system audio player
OpenJTalk.to_wav_file/2 — synthesize text to a WAV file
OpenJTalk.to_wav_binary/2 — synthesize and return WAV bytes
OpenJTalk.Wav.concat_binaries/1 — merge multiple WAV binaries (same format)
OpenJTalk.Wav.concat_files/1 — merge multiple WAV files from paths (same format)

Install

Add the dependency to your mix.exs:

def deps do
  [
    {:open_jtalk_elixir, "~> 0.3"}
  ]
end

Then:

mix deps.get
mix compile

On first compile the project may download and build MeCab, HTS Engine API, and Open JTalk. By default it also downloads and bundles a UTF-8 dictionary and a Mei voice into priv/ (you can turn this off with OPENJTALK_BUNDLE_ASSETS=0).

Build requirements

You’ll need common build tools: gcc/g++, make, curl, tar, unzip. On macOS, the Xcode Command Line Tools are sufficient.

Optional environment flags (honored by the Makefile):

OPENJTALK_FULL_STATIC=1 — attempt a fully static open_jtalk (Linux only; requires static libstdc++)
OPENJTALK_BUNDLE_ASSETS=0|1 — whether to bundle dictionary/voice into priv/

Tested platforms

Host builds (compile and run on the same machine):

Linux x86_64
Linux aarch64
macOS 14 (arm64, Apple Silicon)

Cross-compile (host → target):

Linux x86_64 → Nerves rpi4 (aarch64)

Quick start

# play via system audio player (aplay/paplay/afplay/play)
OpenJTalk.say("元氣ですかあ 、元氣が有れば、なんでもできる")

Options

All synthesis calls accept the same options (values are clamped):

:timbre — voice color offset -0.8..0.8 (default 0.0)
:pitch_shift — semitones -24..24 (default 0)
:rate — speaking speed 0.5..2.0 (default 1.0)
:gain — output gain in dB (default 0)
:voice — path to a .htsvoice file (optional)
:dictionary — path to a directory containing sys.dic (optional)
:timeout — max runtime in ms (default 20_000)
:out — output WAV path (only for to_wav_file/2)
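
To illustrate what "values are clamped" means for the numeric options, here is a minimal sketch. The module name `OptionClampSketch` and its `clamp/1` function are hypothetical and are not part of the library. The ranges are copied from the list above, and the library's actual clamping code may differ.

```elixir
defmodule OptionClampSketch do
  # Hypothetical helper, not the library's implementation.
  # Ranges are taken from the option list in this README.
  @ranges %{
    timbre: {-0.8, 0.8},
    pitch_shift: {-24, 24},
    rate: {0.5, 2.0}
  }

  # Clamp known numeric options into their range; pass everything else through.
  def clamp(opts) when is_list(opts) do
    Enum.map(opts, fn {key, value} ->
      case Map.fetch(@ranges, key) do
        {:ok, {lo, hi}} when is_number(value) -> {key, value |> max(lo) |> min(hi)}
        _ -> {key, value}
      end
    end)
  end
end

# Out-of-range values are pulled back to the nearest bound.
OptionClampSketch.clamp(rate: 3.5, pitch_shift: -30, gain: 6)
```

Under this sketch, a `:rate` of 3.5 becomes 2.0 and a `:pitch_shift` of -30 becomes -24, while `:gain` passes through unchanged because no range is documented for it.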

Concatenate WAVs

You can combine multiple WAVs (same format: channels/rate/bit depth/etc.) into one:

{:ok, a} = OpenJTalk.to_wav_binary("これは一つ目。")
{:ok, b} = OpenJTalk.to_wav_binary("これは二つ目。")
{:ok, c} = OpenJTalk.to_wav_binary("これは三つ目。")
{:ok, merged} = OpenJTalk.Wav.concat_binaries([a, b, c])
# or from files:
# {:ok, merged} = OpenJTalk.Wav.concat_files(["a.wav", "b.wav", "c.wav"])
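
The "same format" requirement is easier to see with a sketch of what concatenation involves: the format header is shared, and only the sample data is appended. The module below, `WavConcatSketch`, is a hypothetical illustration and not the code behind `OpenJTalk.Wav`. It assumes canonical 44-byte PCM headers (RIFF, WAVE, a 16-byte `fmt ` chunk, then `data`) and rejects anything else.

```elixir
defmodule WavConcatSketch do
  # Hypothetical illustration, not the library's implementation.
  # Assumes canonical PCM WAVs: "RIFF", "WAVE", 16-byte "fmt " chunk, "data".

  def concat([first | _] = wavs) do
    with {:ok, fmt, _data} <- parse(first) do
      wavs
      |> Enum.reduce_while({:ok, []}, fn wav, {:ok, acc} ->
        case parse(wav) do
          # Same 16-byte format block: append the sample data.
          {:ok, ^fmt, data} -> {:cont, {:ok, [acc, data]}}
          # Different channels/rate/bit depth: refuse to merge.
          {:ok, _other_fmt, _data} -> {:halt, {:error, :format_mismatch}}
          error -> {:halt, error}
        end
      end)
      |> case do
        {:ok, iodata} -> {:ok, build(fmt, IO.iodata_to_binary(iodata))}
        error -> error
      end
    end
  end

  # Split a canonical WAV into its format block and its sample data.
  defp parse(
         <<"RIFF", _size::little-32, "WAVE", "fmt ", 16::little-32, fmt::binary-size(16),
           "data", len::little-32, data::binary-size(len), _rest::binary>>
       ),
       do: {:ok, fmt, data}

  defp parse(_other), do: {:error, :unsupported_wav}

  # Write a fresh header whose sizes match the merged data.
  defp build(fmt, data) do
    len = byte_size(data)

    <<"RIFF", (36 + len)::little-32, "WAVE", "fmt ", 16::little-32, fmt::binary, "data",
      len::little-32, data::binary>>
  end
end
```

With binaries like `a`, `b`, and `c` from above, `WavConcatSketch.concat([a, b, c])` would return `{:ok, merged}` only if all three share the same format block, and that holds only if the synthesized files really use the canonical header layout assumed here.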

How asset resolution works

The package resolves required assets in this order:

Environment variable override
Bundled asset in priv/
System-installed location
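
The three-step lookup above can be sketched as a "first existing candidate wins" search. `ResolveSketch`, `resolve/2`, and `cli_candidates/0` are hypothetical names used only for illustration. The library's real resolver may be structured differently, and the bundled path here is simplified to a relative `priv/` path.

```elixir
defmodule ResolveSketch do
  # Hypothetical helper, not the library's resolver.
  # Each candidate is {source, zero-arity function returning a path or nil}.
  # Candidates are tried in order; the first path that exists wins.
  def resolve(candidates, exists? \\ &File.exists?/1) do
    Enum.find_value(candidates, {:error, :not_found}, fn {source, path_fun} ->
      path = path_fun.()
      if is_binary(path) and exists?.(path), do: {:ok, source, path}
    end)
  end

  # Candidate list for the CLI binary, in the documented order.
  def cli_candidates do
    [
      {:env, fn -> System.get_env("OPENJTALK_CLI") end},
      # Simplified: a real lookup would locate the application's priv dir.
      {:bundled, fn -> Path.join("priv", "bin/open_jtalk") end},
      {:system, fn -> System.find_executable("open_jtalk") end}
    ]
  end
end

# Returns {:ok, source, path} or {:error, :not_found}.
ResolveSketch.resolve(ResolveSketch.cli_candidates())
```

The existence check is passed in as a function, so the ordering logic can be exercised without touching the filesystem.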

CLI binary (`open_jtalk`)

Env: OPENJTALK_CLI — full path to open_jtalk.
Bundled: priv/bin/open_jtalk (built during compile).
System: open_jtalk found on $PATH.

Dictionary (`sys.dic`)

Env: OPENJTALK_DICTIONARY_DIR — directory containing sys.dic.
Bundled: priv/dictionary/sys.dic or any priv/dictionary/**/sys.dic (e.g. naist-jdic).
System: common locations such as /var/lib/mecab/dic/open-jtalk/naist-jdic, /usr/lib/*/mecab/dic/open-jtalk/naist-jdic, etc.

Voice (`.htsvoice`)

Env: OPENJTALK_VOICE — path to a .htsvoice file.
Bundled: first file matching priv/voices/**/*.htsvoice.
System: standard locations like /usr/share/hts-voice/** or /usr/local/share/hts-voice/**.

If you change environment variables at runtime (or move files), refresh the cached paths:

OpenJTalk.Assets.reset_cache()

Using with Nerves

This library is Nerves-aware. When MIX_TARGET is set, the build defaults to:

OPENJTALK_FULL_STATIC=1 — try to statically link the CLI on Linux targets when possible
OPENJTALK_BUNDLE_ASSETS=1 — bundle CLI, dictionary, and voice into priv/

So for many projects no extra configuration is needed.

Quick Nerves flow

export MIX_TARGET=rpi4
mix deps.get
mix compile
mix firmware

On the device:

{:ok, info} = OpenJTalk.info()
# bundled assets should show up as :bundled
OpenJTalk.say("こんにちは")

Audio on Nerves

OpenJTalk.say/2 requires a system audio player. Most Nerves images use ALSA aplay. If your image does not include a player:

add one to the system image, or
use OpenJTalk.to_wav_file/2 and play the WAV with your chosen mechanism.
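
For the second option, a minimal sketch of "synthesize to a file, then hand it to your own player" might look like the following. `SpeakSketch` and the `play_fun` callback are hypothetical. The sketch also assumes that `OpenJTalk.to_wav_file/2` returns `{:ok, path}` on success and that `/tmp` is writable on the device; neither is confirmed by this README.

```elixir
defmodule SpeakSketch do
  # Hypothetical wrapper, not part of the library.
  # Assumes OpenJTalk.to_wav_file/2 returns {:ok, path} on success.
  # `play_fun` is whatever playback mechanism your firmware provides.
  def speak(text, play_fun, path \\ "/tmp/speech.wav") when is_function(play_fun, 1) do
    with {:ok, wav_path} <- OpenJTalk.to_wav_file(text, out: path) do
      play_fun.(wav_path)
    end
  end
end
```

On the device this could be called as `SpeakSketch.speak("こんにちは", fn path -> MyPlayer.play(path) end)`, where `MyPlayer` stands in for your own playback module. Any error returned by synthesis passes through unchanged.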

Firmware size notes

Bundling the full dictionary + voice + binary increases firmware size. Approximate (uncompressed) sizes:

Dictionary: ~103 MB
Mei voice: ~2.2 MB
CLI binary: ~2.4 MB

If that’s too large, you can skip bundling at compile time and provision the assets separately (rootfs overlay, /data, OTA, etc.):

MIX_TARGET=rpi4 OPENJTALK_BUNDLE_ASSETS=0 mix deps.compile open_jtalk_elixir

Then point the library to the provisioned assets (for example in config/runtime.exs):

System.put_env("OPENJTALK_CLI", "/data/open_jtalk/bin/open_jtalk")
System.put_env("OPENJTALK_DICTIONARY_DIR", "/data/open_jtalk/dic")
System.put_env("OPENJTALK_VOICE", "/data/open_jtalk/voices/mei_normal.htsvoice")
OpenJTalk.Assets.reset_cache()

How you provision those files into your image is outside the scope of this library.

Third-party components and licenses

This package does not redistribute third-party assets by default. At compile time it may download and build the following components:

Open JTalk 1.11
- License: Modified BSD (3-Clause)
- Source: http://open-jtalk.sourceforge.net/
HTS Engine API 1.10
- License: Modified BSD (3-Clause)
- Source: http://hts-engine.sourceforge.net/
MeCab 0.996
- License: Tri-licensed (GPL / LGPL / BSD); used under BSD terms
- Source: https://taku910.github.io/mecab/
Open JTalk Dictionary (NAIST-JDIC UTF-8) 1.11
- License: BSD-style by NAIST
- Source: https://sourceforge.net/projects/open-jtalk/files/Dictionary/
HTS Voice “Mei” (MMDAgent_Example 1.8)
- License: CC BY 3.0
- Source: https://sourceforge.net/projects/mmdagent/files/MMDAgent_Example/
- Attribution: “HTS Voice ‘Mei’ © Nagoya Institute of Technology, licensed CC BY 3.0.”
