Using a German dictionary and language model with Sphinx4

Question


I can use the en-us models that ship with Sphinx4 without any problem:

cfg.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
cfg.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
cfg.setLanguageModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us.lm.bin");

With this configuration I can transcribe an English audio recording.

Now I want to do the same with German recordings. The CMUSphinx website links to a page of acoustic and language models, which offers a "German Voxforge" archive. In that archive I can find the files for the acoustic model, but as far as I can tell it contains neither a dictionary nor a language model.

How do I get the dictionary and language model for German in Sphinx4?

+5

cmusphinx sphinx4

0__ Feb 19 '16 at 20:38


1 answer

Nikolay Shmyrev · Accepted Answer · 2016-02-19T21:53:59+0000

You have to build them yourself. You can build a language model from subtitles or from Wikipedia dumps. The CMUSphinx documentation describes the process.
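As a rough illustration of the first step, raw text from subtitles or a Wikipedia dump usually has to be normalized into one sentence per line before a language model toolkit can count n-grams. The sketch below is only an assumption about what such a cleanup might look like: the class name CorpusPrep, the splitting rule, and the <s> ... </s> sentence markers are illustrative choices, so check what your toolkit actually expects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class CorpusPrep {
    /** Turns raw text into normalized training sentences, one per list entry. */
    public static List<String> toSentences(String rawText) {
        List<String> sentences = new ArrayList<>();
        // Split on runs of sentence-ending punctuation.
        for (String chunk : rawText.split("[.!?]+")) {
            String cleaned = chunk.toLowerCase(Locale.GERMAN)
                    // Replace everything that is not a Unicode letter or whitespace
                    // with a space; umlauts and sharp s count as letters and survive.
                    .replaceAll("[^\\p{L}\\s]", " ")
                    // Collapse the resulting runs of whitespace.
                    .replaceAll("\\s+", " ")
                    .trim();
            if (!cleaned.isEmpty()) {
                // Sentence markers are an assumption; some toolkits add them on their own.
                sentences.add("<s> " + cleaned + " </s>");
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        // \u00f6 is the letter o-umlaut, written as an escape to keep the source ASCII-only.
        String sample = "Guten Morgen! Wie geht es Ihnen? Das Wetter ist sch\u00f6n, oder?";
        toSentences(sample).forEach(System.out::println);
    }
}
```

Real corpora need more care than this (numbers, abbreviations, markup), so treat the method as a starting point rather than a complete normalizer.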

The latest German models are not actually on the CMUSphinx page; they are on GitHub under gooofy. In that gooofy project you can find dictionary documentation, models, and related materials.
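If you combine a downloaded dictionary with training text of your own, it helps to check that every word in the text has a dictionary entry, since the recognizer can only output words it has a pronunciation for. The following is a minimal sketch, assuming the usual Sphinx dictionary layout (a word followed by its phones, with alternate pronunciations marked as word(2), word(3), and so on). The class name DictCoverage and the sample entries, including the phone symbols, are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class DictCoverage {
    /** Collects the words defined in Sphinx-style dictionary lines. */
    public static Set<String> dictionaryWords(List<String> dictLines) {
        Set<String> words = new HashSet<>();
        for (String line : dictLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // The first whitespace-separated field is the word; the rest are phones.
            String head = trimmed.split("\\s+", 2)[0];
            // Strip an alternate-pronunciation suffix such as "(2)".
            words.add(head.replaceAll("\\(\\d+\\)$", ""));
        }
        return words;
    }

    /** Returns the corpus words that have no dictionary entry, sorted. */
    public static Set<String> missingWords(List<String> sentences, Set<String> dict) {
        Set<String> missing = new TreeSet<>();
        for (String sentence : sentences) {
            for (String token : sentence.trim().split("\\s+")) {
                boolean marker = token.equals("<s>") || token.equals("</s>");
                if (!token.isEmpty() && !marker && !dict.contains(token)) {
                    missing.add(token);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical dictionary entries; the phone symbols are made up for illustration.
        List<String> dictLines = Arrays.asList("hallo H A L O", "welt V E L T", "welt(2) V AE L T");
        List<String> corpus = Arrays.asList("<s> hallo welt </s>", "<s> guten tag </s>");
        System.out.println(missingWords(corpus, dictionaryWords(dictLines))); // prints [guten, tag]
    }
}
```

Words reported as missing would need pronunciations added to the dictionary, or would have to be dropped from the training text.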
