

nyasu3w

@nyasu3w nyasu3w commented Jan 26, 2025

This pull request adds the llm_exttts module, which uses open_jtalk to speak Japanese.
As prerequisites, the following additional Ubuntu packages must be installed on the target (product) environment.

  • open-jtalk
  • open-jtalk-mecab-naist-jdic
  • hts-voice-nitech-jp-atr503-m001
  • sox

I hope these can be pulled in as dependencies of the deb package.

The setup JSON needs the following fields in "data":

  • model (but not used)
  • cmdtype (only "open_jtalk" is available for now)
  • response_format and enoutput (same as melotts)
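
As an illustration of the fields listed above, here is a minimal sketch that builds such a "data" object in Python. Only the field names and the "open_jtalk" value come from this description; the values for model, response_format, and enoutput are placeholder assumptions and should be replaced with whatever is used for melotts.

```python
import json

# Hypothetical "data" object for the llm_exttts setup request.
# Field names come from the PR description; every value except
# cmdtype is a placeholder assumption, not taken from the PR.
setup_data = {
    "model": "unused",             # must be present, but the module does not use it
    "cmdtype": "open_jtalk",       # the only command type available for now
    "response_format": "sys.pcm",  # assumed value; use the same one as for melotts
    "enoutput": False,             # assumed value; use the same one as for melotts
}

print(json.dumps({"data": setup_data}, indent=2))
```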

This module is the final module in a pipeline. The generated speech is played directly, without going through the llm-audio module.
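
For readers unfamiliar with the tools, the following is a rough sketch of how text can be synthesized with open_jtalk and played with sox from a script. It is not the module's actual code. The dictionary and voice paths are assumed to be the locations where the Ubuntu packages listed above install their files, and the function names are hypothetical.

```python
import shlex
import subprocess
import tempfile
from pathlib import Path

# Assumed install locations of the Ubuntu packages listed in the PR description.
DICT_DIR = "/var/lib/mecab/dic/open-jtalk/naist-jdic"
VOICE = "/usr/share/hts-voice/nitech-jp-atr503-m001/nitech_jp_atr503_m001.htsvoice"


def build_commands(text_path, wav_path):
    """Return the synthesis and playback command lines (hypothetical helper)."""
    # -x: dictionary directory, -m: HTS voice file, -ow: output WAV file
    synth = ["open_jtalk", "-x", DICT_DIR, "-m", VOICE,
             "-ow", str(wav_path), str(text_path)]
    # "play" is the playback front end shipped with sox; -q suppresses progress output
    play = ["play", "-q", str(wav_path)]
    return synth, play


def speak(text):
    """Synthesize Japanese text to a temporary WAV file and play it."""
    with tempfile.TemporaryDirectory() as tmp:
        text_path = Path(tmp) / "input.txt"
        wav_path = Path(tmp) / "output.wav"
        text_path.write_text(text, encoding="utf-8")
        synth, play = build_commands(text_path, wav_path)
        subprocess.run(synth, check=True)
        subprocess.run(play, check=True)


if __name__ == "__main__":
    # Only print the commands here; call speak("こんにちは") on a machine
    # that has the packages installed.
    synth, play = build_commands("input.txt", "output.wav")
    print(shlex.join(synth))
    print(shlex.join(play))
```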

@nyasu3w
Author

nyasu3w commented Jan 26, 2025

The systemd service file needs to set an environment variable for this module.
If it is missing, the commands the module launches will fail.
Most of the service file is the same as for the other modules.

[Service]
Environment = LD_LIBRARY_PATH=/usr/local/lib:/usr/lib:/opt/lib:/opt/usr/lib:/soc/lib
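
To illustrate why the variable matters, here is a small sketch of passing the same LD_LIBRARY_PATH to a child command from a script. This is an assumption-laden alternative for manual testing outside systemd, not something the PR does; child_env is a hypothetical helper.

```python
import os

# Value copied from the service file snippet.
LD_LIBRARY_PATH = "/usr/local/lib:/usr/lib:/opt/lib:/opt/usr/lib:/soc/lib"


def child_env(base=None):
    """Return a copy of the environment with LD_LIBRARY_PATH set if it is missing."""
    env = dict(os.environ if base is None else base)
    # Keep an existing value so a deliberately configured path is not overridden.
    env.setdefault("LD_LIBRARY_PATH", LD_LIBRARY_PATH)
    return env
```

The result can be passed as the env argument of subprocess.run when launching a command by hand.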

@nyasu3w
Author

nyasu3w commented Feb 5, 2025

In my own project, llm_exttts has been updated to support "speed" and "volume" settings and a stop/resume method.
If you consider this PR worth accepting, I will either open a follow-up PR to bring those changes in or create a new PR that includes them.

@Abandon-ht
Collaborator

Thank you very much for your PR. I have tested it and found no problems. Please update it if you have new commits. I'll merge it after you're done.

@nyasu3w
Author

nyasu3w commented Feb 6, 2025

Thank you, too. That is very good news for me.
I added a new change to the PR. Please check it.

In this PR, I tried to add a new stop/resume function, but I am not sure whether I did it the correct way. I would appreciate information on how to add a new RPC entry to a module.

@dianjixz
Contributor

dianjixz commented Feb 7, 2025

I personally think that the name "llm-exttts" is not very elegant. Perhaps it needs a more descriptive name. What do you think? @nyasu3w

@nyasu3w
Author

nyasu3w commented Feb 7, 2025

The name is short for "EXTernal TTS": a TTS module that can use external commands such as open-jtalk.
For now it uses only open-jtalk, so the name is too broad and does not describe what the module does.
I would like to suggest llm-openjtalk as the name. How about that?

@dianjixz
Contributor

dianjixz commented Feb 8, 2025

The name "llm-openjtalk" is nice; I agree. I looked at your submission, and it currently runs a command to call an existing executable. If possible, I would still prefer that it call the OpenJTalk API directly from C++, which would be more consistent with this project.

@dianjixz
Contributor

dianjixz commented Feb 8, 2025

In terms of the project's architecture, TTS is a part of the entire workflow and is subject to wake-up control by the keyword spotting (KWS) unit.

@nyasu3w
Author

nyasu3w commented Feb 8, 2025

For now, I would like to withdraw this pull request until my implementation fits the project's architecture.
From my side, there is not enough information available about StackFlow to understand the architecture. I hope that materials and documentation will be provided in the near future!

@nyasu3w nyasu3w closed this Feb 8, 2025
