Java: Text to Speech Engines Overview

I am currently looking for a Java Text to Speech (TTS) framework. During my research, I found several frameworks that are (partially) compatible with JSAPI 1.0 listed on the JSAPI implementations page, plus others such as Mary and Say-It-Now. I also noted that there is currently no reference implementation for JSAPI.

Brief tests that I ran on FreeTTS (the first entry on the JSAPI implementations page) show that it is far from able to read even simple, obvious words (examples: ABC, whiteboard). Other tests are currently underway.

So here are the questions (six, actually):

  • Which Java-based TTS frameworks have you used?
  • Which, in your opinion, are capable of reading the largest vocabulary?
  • How about the quality of their voice?
  • How about their performance?
  • Which non-Java frameworks with Java bindings are on the scene?
  • Which of them would you recommend?

Thanks in advance for your comments and suggestions.

+45
java text-to-speech
Sep 27 '08 at 10:43
9 answers

I've had really good luck with FreeTTS.

+18
Sep 27 '08 at 11:36
— -
+14
Sep 13 '13 at 15:36

I used Mary before, and I was very impressed with the quality of the voices. Unfortunately, I have not used any of the others.

+6
Sep 27 '08 at 10:58

I used AT&T Natural Voices, which provides JSAPI and MS SAPI bindings. It offers excellent voice quality, a good "general" speaking vocabulary, many pronunciation controls and several languages. It is a bit expensive but works very well.

I used it to read important sensor telemetry aloud to drivers in a mobile sensing application. We had no complaints about voice quality. Out of the box it had about 75% accuracy with scientific terms and much higher (maybe 90%+) with normal dialogue. We got up to 99+% accuracy using markup (most of the errors were in scientific terms with unusual combinations of phonemes).
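The markup step could look roughly like the sketch below. The answer does not say which markup language was used, so this assumes the engine accepts W3C SSML and uses the `<sub alias="...">` element to substitute a spoken form for a term the engine mispronounces. The class `PronunciationMarkup`, the method `toSsml`, and the sample lexicon entries are all made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of any TTS vendor's API.
final class PronunciationMarkup {

    // Escapes characters that are special in XML text and attribute values.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Wraps each whole-word, case-sensitive occurrence of a lexicon key in an
    // SSML <sub> element whose alias attribute holds the spoken form.
    static String toSsml(String text, Map<String, String> lexicon) {
        StringBuilder body = new StringBuilder();
        int last = 0;
        if (!lexicon.isEmpty()) {
            String alternatives = lexicon.keySet().stream()
                    .map(Pattern::quote)
                    .collect(Collectors.joining("|"));
            Matcher m = Pattern.compile("\\b(" + alternatives + ")\\b").matcher(text);
            while (m.find()) {
                // Copy the untouched text before the match, then the marked-up term.
                body.append(escapeXml(text.substring(last, m.start())));
                body.append("<sub alias=\"")
                    .append(escapeXml(lexicon.get(m.group(1))))
                    .append("\">")
                    .append(escapeXml(m.group(1)))
                    .append("</sub>");
                last = m.end();
            }
        }
        body.append(escapeXml(text.substring(last)));
        return "<speak version=\"1.0\" xmlns=\"http://www.w3.org/2001/10/synthesis\""
                + " xml:lang=\"en-US\">" + body + "</speak>";
    }

    public static void main(String[] args) {
        Map<String, String> lexicon = new LinkedHashMap<>();
        lexicon.put("NaCl", "sodium chloride"); // sample entries, made up
        lexicon.put("pH", "P H");
        System.out.println(toSsml("NaCl reading high, pH < 5", lexicon));
    }
}
```

The resulting string would then be passed to whatever speak call the engine's binding exposes. Whether a given engine honors `<sub>` is something to check in its documentation.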

It was a bit heavy on the processor (we were running on a Pentium III-class machine, and it peaked at 50%-75% CPU usage). It uses a native speech engine (available for Windows, Linux and Mac) with a Java interface.

There are a huge number of voices and languages...

+4
Sep 29 '08 at 19:30

In fact, there is not much choice:

  • Festival is the oldest. It is written in C++ but has Java bindings.
  • eSpeak, the quick and simple one used by Google Translate
  • MBROLA

Pure Java:

  • FreeTTS, whose code was ported from Festival and then open-sourced; development has stopped.
  • MaryTTS is more powerful and ready for production.

There are also other proprietary programs, such as:

  • Acapela
  • Nuance Vocalizer

If your software is Windows only, you can use the Microsoft Speech API.
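For the non-Java engines listed above, one low-tech alternative to a native binding is to launch the engine's command-line tool from Java. Here is a minimal sketch for eSpeak, assuming an `espeak` executable is installed and on the PATH. The class and method names are hypothetical; `-v` (voice) and `-s` (speed in words per minute) are standard espeak options.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper for illustration; assumes "espeak" is on the PATH.
final class EspeakCli {

    // Builds the argument list: espeak -v <voice> -s <wordsPerMinute> <text>.
    static List<String> buildCommand(String text, String voice, int wordsPerMinute) {
        List<String> cmd = new ArrayList<>();
        cmd.add("espeak");
        cmd.add("-v");
        cmd.add(voice);
        cmd.add("-s");
        cmd.add(Integer.toString(wordsPerMinute));
        // Passed as a single argument, so no shell quoting is involved.
        // Text that starts with "-" would be parsed as an option.
        cmd.add(text);
        return cmd;
    }

    // Starts espeak, waits for it to finish, and returns its exit code.
    static int speak(String text, String voice, int wordsPerMinute)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(text, voice, wordsPerMinute))
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            int exit = speak("ABC whiteboard", "en", 150);
            System.out.println("espeak exited with code " + exit);
        } catch (IOException e) {
            // Typically means the executable was not found.
            System.err.println("Could not start espeak: " + e.getMessage());
        }
    }
}
```

Spawning a process per utterance adds latency, so this suits occasional prompts better than continuous speech.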

+3
Dec 25 '14 at 2:55

Thanks a lot to everyone; the FreeTTS source did the trick. In short: if it is run as java -jar freetts.jar some-more-args-here, it pronounces fewer words than when it is run as bin/Server.jar and bin/Client.jar.

+1
Sep 29 '08 at 9:28

I used FreeTTS, but I had a big problem getting the MBROLA voices to run on my MacBook Pro. I got the MBROLA voices working on Windows (painfully) and Linux. I had no luck loading any other voice packs into FreeTTS, which is a shame because the voices it ships with are terrible IMO. Beyond this, I had a little success with Cloudgarden, but it only works on Windows AFAIK. I would be interested to hear about other successes and failures with speech engines, as this kind of work is especially difficult. I have also dabbled a little in Sphinx4. I just pulled down JVXML (which seemed to be based on Sphinx4) last night, but couldn't get it to work for some strange reason.

+1
Apr 10 '09 at 13:32

I have helped out with Mary. I feel it has potential if someone smarter than me separates the HMM voices from the core (these voices do not need large datasets and sound fine). I am also trying to build an event system for FreeTTS that dispatches an event whenever it pronounces a word. I had it working, but it is now broken on Linux (probably due to a timer bug).

+1
Feb 27 '10 at

I did not like MaryTTS, although it does offer multilingual support and an understandable voice.

To convert speech to text, the best option is sphinx4-5prealpha. I give it a thumbs up because it has a customizable, flexible, and modifiable recognizer and grammar.

0
Aug 08 '17 at 12:21


