Real-time transcription with the Google Cloud Speech API over gRPC in Electron

I want to get the same real-time transcription that the Web Speech API provides, but using the Google Cloud Speech API.

The main goal is to transcribe live audio recorded in an Electron application by streaming it to the Speech API over gRPC.

This is a simplified version of what I implemented:

const { desktopCapturer } = window.require('electron');
const speech = require('@google-cloud/speech');

const client = speech.v1({
    projectId: 'my_project_id',
    credentials: {
        client_email: 'my_client_email',
        private_key: 'my_private_key',
    },
});

desktopCapturer.getSources(
    { types: ['window', 'screen'] },
    (error, sources) => {
        navigator.mediaDevices
            .getUserMedia({
                audio: true,
            })
            .then((stream) => {
                let fileReader = new FileReader();
                let arrayBuffer;
                fileReader.onloadend = () => {
                    arrayBuffer = fileReader.result;
                    let speechStreaming = client.streamingRecognize({
                        config: {
                            encoding: speech.v1.types.RecognitionConfig.AudioEncoding.LINEAR16,
                            languageCode: 'en-US',
                            sampleRateHertz: 44100,
                        },
                        singleUtterance: true,
                    }).on('data', (response) => response);

                    speechStreaming.write(arrayBuffer);
                }

                fileReader.readAsArrayBuffer(stream);
            })
    }
);

The Speech API responds with an error saying that the audio stream is too slow and is not being sent in real time.

I suspect the cause is that I pass the stream as is, without any formatting or initialization of the object, so streaming recognition cannot be performed.
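To make the formatting step concrete: the LINEAR16 encoding declared in the config means signed 16-bit little-endian PCM, while the Web Audio API delivers samples as 32-bit floats between -1 and 1. The following is a minimal sketch of that conversion, not a confirmed fix for the error above. The function name floatTo16BitPCM is hypothetical, and it assumes Node's Buffer is available, as it is in an Electron renderer with Node integration enabled.

```javascript
// Convert Web Audio samples (floats in the range -1..1) to LINEAR16,
// i.e. signed 16-bit little-endian PCM.
// `floatTo16BitPCM` is a hypothetical helper name, not part of any library.
function floatTo16BitPCM(samples) {
    const pcm = Buffer.alloc(samples.length * 2); // 2 bytes per sample
    for (let i = 0; i < samples.length; i++) {
        // Clamp so out-of-range samples cannot overflow the 16-bit range.
        const s = Math.max(-1, Math.min(1, samples[i]));
        // Scale asymmetrically: -1 maps to -32768 and +1 maps to 32767.
        const value = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
        pcm.writeInt16LE(value, i * 2);
    }
    return pcm;
}
```

Each chunk of samples taken from the microphone would go through a conversion like this before being written to the recognition stream, and the sampleRateHertz value in the config would have to match the rate at which the audio was actually captured.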

1 answer

There is an official sample on GitHub for this: https://github.com/googleapis/nodejs-speech/blob/master/samples/infiniteStreaming.js

It uses the streamingRecognize method of the Google Cloud Speech API.

It can be used from Electron as well, in the manner of OtterAI.
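As a rough illustration of how a streamingRecognize session can be opened and read, here is a minimal sketch. It is not taken from the linked sample. The names openRecognizeStream and onTranscript are hypothetical, the client argument is assumed to be a SpeechClient instance from @google-cloud/speech, and the 16000 Hz rate in the usage note is only an example value.

```javascript
// Open one streaming recognition session and report transcripts via a callback.
// `client` is assumed to be a SpeechClient from @google-cloud/speech;
// `openRecognizeStream` and `onTranscript` are hypothetical names.
function openRecognizeStream(client, sampleRateHertz, onTranscript) {
    const recognizeStream = client.streamingRecognize({
        config: {
            encoding: 'LINEAR16',
            sampleRateHertz, // must match the rate of the audio being written
            languageCode: 'en-US',
        },
        interimResults: true, // also receive partial hypotheses while speaking
    });

    recognizeStream.on('error', (err) => {
        console.error('Speech API error:', err);
    });

    recognizeStream.on('data', (response) => {
        // Some responses carry no results, so guard every level.
        const result = response.results && response.results[0];
        const best = result && result.alternatives && result.alternatives[0];
        if (best) {
            onTranscript(best.transcript, Boolean(result.isFinal));
        }
    });

    return recognizeStream;
}
```

With a real client this would be called as openRecognizeStream(new speech.SpeechClient(), 16000, callback), and each LINEAR16 audio chunk would then be passed to write() on the returned stream as soon as it is captured. Passing the client in as an argument keeps the function independent of how credentials are configured.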


Source: https://habr.com/ru/post/1688535/