I just found this. Main page: [https://k2-fsa.github.io/sherpa/onnx/index.html]
This is huge! As a German, I use Thorsten medium
[https://huggingface.co/csukuangfj/sherpa-onnx-apk/resolve/main/tts-engine-new/1.10.26/sherpa-onnx-1.10.26-arm64-v8a-de-tts-engine-vits-piper-de_DE-thorsten-medium.apk]
as he simply made the best dataset. Mixing English with German, speaking
numbers and single letters, pausing at a line break without a “.”: all of those
can be essential. And… it is nearly perfect! And all local! This is crazy!
eSpeak can finally be laid to rest!
I mentioned eSpeak at least somewhere. It is 32-bit and likely unmaintained. And the Android version is already better than what I had on Fedora KDE.
Yes, something like an improved eSpeak would be fine and extremely efficient. But this works fine too.
Modern phones have NPUs for low-power neural network tasks like these. Older ones hopefully use the GPU for that.
The apps are pretty flawed in that each of them has only one model.
And it is very impressive how eSpeak can do so much, so efficiently.
It's simply that nobody created a better voice model for it (and removed all the silly joke voices), and ported the APK to modern Android with ARMv8-A (64-bit).