Synthesize over 1,500 characters with AWS Polly?

My idea was to use AWS Polly to read RSS news feeds aloud. From what I read, Polly is very flexible about the amount of text it can convert; one example given is "Adventures of Huckleberry Finn" by Mark Twain, at ~600k characters. The problem is that when I try to convert my articles to speech, I get the following error:

An error occurred (TextLengthExceededException) when calling the SynthesizeSpeech operation: Maximum text length has been exceeded

The text I was trying to convert was about 5,000 characters.

Is there a way (with or without the API) to convert long lines of text using Polly without having to cut them into millions of different parts?

Any tip in the right direction will be appreciated.

Thanks.

+4
5 answers

The input text can be up to 1,500 billed characters (3,000 total characters). SSML tags are not counted as billed characters.

http://docs.aws.amazon.com/polly/latest/dg/limits.html

The pricing examples are apparently intended to give an idea of the relatively low cost of synthesizing a large job, but in practice the work has to be split into groups of sentences and submitted to the API, which is the only interface: the SDK and the CLI both call the same SynthesizeSpeech API.
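A minimal sketch of that splitting step in Python, assuming boto3 and configured AWS credentials. The names split_text and synthesize_to_file, the "Joanna" default voice, and the 1,500-character default are illustrative assumptions rather than anything from the Polly documentation, so check the current limits before relying on the number.

```python
import re

MAX_CHARS = 1500  # limit quoted above; an assumption, verify against current Polly limits


def split_text(text, max_chars=MAX_CHARS):
    """Split text into chunks of at most max_chars, breaking after sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut into fixed-size slices.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def synthesize_to_file(text, path, voice_id="Joanna"):
    """Call SynthesizeSpeech once per chunk and append the audio to one file."""
    import boto3  # imported here so split_text works without the SDK installed

    polly = boto3.client("polly")
    with open(path, "wb") as out:
        for chunk in split_text(text):
            response = polly.synthesize_speech(
                Text=chunk, OutputFormat="mp3", VoiceId=voice_id
            )
            out.write(response["AudioStream"].read())
```

Appending the MP3 streams back to back usually plays fine, but some players may report a wrong duration; re-encoding the result is one way around that.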

+2

I don’t have a special tip without breaking the text apart, but I wrote an article with a way to do it in NodeJS. If you do not have another alternative, feel free to view and comment on it!

1500 AWS Polly text-to-speech

+1

AWS Polly has a 1,500-character limit. This JavaScript solution works around it: it calls the API and produces mp3 output.

Code: https://github.com/Aaronbest94/Polly-Character-Limitations

+1

For longer texts, see the documentation on creating long audio files: https://docs.aws.amazon.com/polly/latest/dg/longer-cli.html

With the AWS CLI:

aws polly start-speech-synthesis-task \
--region eu-central-1 \
--endpoint-url "https://polly.eu-central-1.amazonaws.com/" \
--output-format mp3 \
--output-s3-bucket-name your-bucket-name \
--output-s3-key-prefix optional/prefix/path/file \
--voice-id Hans \
--text-type ssml \
--text file://output.xml

The task runs asynchronously and writes the resulting file to the S3 bucket.
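The same request can be made from Python with boto3's start_speech_synthesis_task. This is a rough sketch that mirrors the CLI call above: build_task_request and start_task are hypothetical helper names, and the bucket, prefix, voice, and region are placeholders taken from that command.

```python
def build_task_request(ssml, bucket, key_prefix, voice_id="Hans"):
    """Build keyword arguments for Polly's StartSpeechSynthesisTask call."""
    return {
        "OutputFormat": "mp3",
        "OutputS3BucketName": bucket,
        "OutputS3KeyPrefix": key_prefix,
        "Text": ssml,
        "TextType": "ssml",
        "VoiceId": voice_id,
    }


def start_task(request, region="eu-central-1"):
    """Submit the task; returns its id and the S3 URI where the audio will appear."""
    import boto3  # requires AWS credentials with Polly and S3 write access

    polly = boto3.client("polly", region_name=region)
    task = polly.start_speech_synthesis_task(**request)["SynthesisTask"]
    return task["TaskId"], task["OutputUri"]
```

For example, start_task(build_task_request(open("output.xml").read(), "your-bucket-name", "optional/prefix/path/file")) would submit the same job as the command above.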

0

I have a good explanation of this here: https://blog.pixiebytez.com/2019/04/speaking-texts.html. This approach writes the synthesized audio to S3, so you can convert more than the stated limit. Just increase the wait time if your file is large. A nice side effect is that you can put the script in a loop to convert multiple files, renaming the bucket object for each one.
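Instead of a fixed wait time, the task status can be polled until it finishes. The sketch below is not taken from the linked post: wait_for_task is a hypothetical helper, and it expects a callable such as boto3's polly.get_speech_synthesis_task to be passed in.

```python
import time


def wait_for_task(get_task, task_id, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll a synthesis task until it completes; return the output URI."""
    waited = 0.0
    while True:
        task = get_task(TaskId=task_id)["SynthesisTask"]
        status = task["TaskStatus"]  # scheduled, inProgress, completed, or failed
        if status == "completed":
            return task["OutputUri"]
        if status == "failed":
            raise RuntimeError(task.get("TaskStatusReason", "synthesis task failed"))
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} still {status} after {timeout}s")
        sleep(interval)
        waited += interval
```

With a real client this would be called as wait_for_task(polly.get_speech_synthesis_task, task_id), once per file when converting several files in a loop.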

0

Source: https://habr.com/ru/post/1664803/

