Using a German dictionary and language model with Sphinx4

I can use the en-us models that ship with Sphinx4 without any problem:

cfg.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
cfg.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
cfg.setLanguageModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us.lm.bin");

With this configuration I can transcribe an audio file recorded in English.

Now I want to do the same with German recordings. On the CMUSphinx website I found a link to "Acoustic and language models", which offers an archive called "German Voxforge". I assume it contains the files needed for the acoustic model. As far as I can tell, it does not contain a dictionary or a language model.

How do I get the dictionary and language model for German in Sphinx4?

1 answer

You create them yourself. You can build a language model from subtitles or Wikipedia dumps. The CMUSphinx documentation describes the process.
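
Before a language model can be trained, the raw text (subtitles, Wikipedia dumps) has to be turned into a clean corpus. The following is only a minimal sketch of that preparation step, not the official procedure: the class name CorpusCleaner and its methods are made up for illustration, and wrapping each sentence in <s> ... </s> markers is an assumption about what your language-model toolkit expects, so check its documentation first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: turns raw sentences into normalized training lines.
class CorpusCleaner {

    // Lowercases the sentence, replaces everything that is not a letter
    // (umlauts and sharp s count as letters) or whitespace with a space,
    // and collapses runs of whitespace into a single space.
    static String normalize(String sentence) {
        String lower = sentence.toLowerCase(Locale.GERMAN);
        String lettersOnly = lower.replaceAll("[^\\p{L}\\s]", " ");
        return lettersOnly.trim().replaceAll("\\s+", " ");
    }

    // Normalizes each sentence, drops empty results, and wraps the rest
    // in sentence markers (assumption: the toolkit wants <s> ... </s>).
    static List<String> toTrainingLines(List<String> sentences) {
        List<String> lines = new ArrayList<>();
        for (String sentence : sentences) {
            String normalized = normalize(sentence);
            if (!normalized.isEmpty()) {
                lines.add("<s> " + normalized + " </s>");
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList(
                "Guten Morgen, wie geht es dir?",
                "   ",
                "Das Wetter ist gut!");
        for (String line : toTrainingLines(raw)) {
            System.out.println(line);
        }
    }
}
```

Note that this simple normalization also drops digits and splits words at apostrophes and hyphens; a real corpus would need numbers spelled out and such cases handled deliberately. Every word left in the corpus must also appear in the pronunciation dictionary.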

The latest German models are not actually on the CMUSphinx page; they are on GitHub under gooofy. In that gooofy project you can find the dictionary, documentation, models, and related materials.
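
Once you have the three German pieces, they replace the en-us paths from the question. The sketch below only assembles the paths; the directory layout and the file names (acoustic, german.dict, german.lm.bin) are placeholders I made up, not the names used in the Voxforge archive or the gooofy project. I believe Sphinx4 accepts plain filesystem paths in addition to resource: URLs, but verify that against your version.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the three model paths under one directory.
class GermanModelPaths {

    // Assumed layout (placeholder names, adjust to your actual files):
    //   <modelRoot>/acoustic        acoustic model directory
    //   <modelRoot>/german.dict     pronunciation dictionary
    //   <modelRoot>/german.lm.bin   language model
    static Map<String, String> resolve(Path modelRoot) {
        Map<String, String> paths = new LinkedHashMap<>();
        paths.put("acoustic", modelRoot.resolve("acoustic").toString());
        paths.put("dictionary", modelRoot.resolve("german.dict").toString());
        paths.put("languageModel", modelRoot.resolve("german.lm.bin").toString());
        return paths;
    }

    public static void main(String[] args) {
        Map<String, String> paths = resolve(Paths.get("models", "de"));
        paths.forEach((name, path) -> System.out.println(name + " = " + path));

        // With Sphinx4 on the classpath, the values would be passed to the
        // same setters used in the question:
        //   cfg.setAcousticModelPath(paths.get("acoustic"));
        //   cfg.setDictionaryPath(paths.get("dictionary"));
        //   cfg.setLanguageModelPath(paths.get("languageModel"));
    }
}
```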


Source: https://habr.com/ru/post/1243430/