Record / save audio from the speech recognition intent

Before asking this question, I checked all the Stack Overflow threads related to this problem without any success, so please do not respond with links to other threads.

I want to save / record the audio that the Google recognition service used for speech-to-text (using RecognizerIntent or SpeechRecognizer).

I tried several ideas:

  • onBufferReceived from RecognitionListener: I know this is not supposed to work, I just tested it to see what happens, and onBufferReceived is never called (tested on a Galaxy Nexus with JB 4.3).
  • Using a MediaRecorder: does not work. It breaks speech recognition, because only one microphone operation is allowed at a time.
  • Trying to find where the recognition service saves a temporary audio file before sending it to the speech-to-text API, so I could copy it, but without success.

I was almost desperate, but I just noticed that the Google Keep app is doing exactly what I need! I debugged the Keep application a bit using logcat, and it also calls "RecognizerIntent.ACTION_RECOGNIZE_SPEECH" (as we developers do) to trigger speech-to-text. But how does Keep save the audio? Could it be a hidden API? Is Google "cheating"?

thanks for the help

Regards

+18
android speech-recognition speech-to-text
Apr 13 '14 at 19:31
3 answers

@Kaarel's answer is almost complete: the resulting audio is in intent.getData() and can be read using ContentResolver.

Unfortunately, the returned AMR file is of poor quality, and I could not find a way to get a high quality recording. Any value I tried other than "audio/AMR" returned null in intent.getData().

If you find a way to get a high quality recording - comment or add an answer!

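    import android.content.ContentResolver;
    import android.content.Intent;
    import android.net.Uri;
    import android.os.Bundle;
    import android.speech.RecognizerIntent;
    import java.io.FileNotFoundException;
    import java.io.InputStream;
    import java.util.ArrayList;

    // The members below belong inside an Activity.

    // Any request code you choose.
    private static final int REQUEST_SPEECH = 1;

    public void startSpeechRecognition() {
        // Fire an intent to start the speech recognition activity.
        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        // Secret parameters that, when added, provide an audio URI in the result.
        intent.putExtra("android.speech.extra.GET_AUDIO_FORMAT", "audio/AMR");
        intent.putExtra("android.speech.extra.GET_AUDIO", true);
        startActivityForResult(intent, REQUEST_SPEECH);
    }

    // Handle the result of speech recognition.
    @Override
    public void onActivityResult(int requestCode, int resultCode, Intent data) {
        if (requestCode != REQUEST_SPEECH || resultCode != RESULT_OK || data == null) {
            return;
        }
        // The resulting text is in the extras:
        Bundle bundle = data.getExtras();
        ArrayList<String> matches = bundle.getStringArrayList(RecognizerIntent.EXTRA_RESULTS);
        // The recording URI is in getData(); it is null if no audio was returned:
        Uri audioUri = data.getData();
        ContentResolver contentResolver = getContentResolver();
        try {
            InputStream filestream = contentResolver.openInputStream(audioUri);
            // TODO: read audio file from inputstream
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }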
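The TODO in the snippet above (reading the audio from the InputStream) could be filled in with a plain stream-to-file copy. This is only a minimal sketch: the class name AudioSaver and the method saveStream are made up for illustration and are not part of any Android API, and nothing here is specific to AMR.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper (name chosen for this example only).
final class AudioSaver {
    /** Copies all bytes from in to target and returns the number of bytes written. */
    static long saveStream(InputStream in, File target) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        // try-with-resources closes the output file even if the copy fails.
        try (OutputStream out = new FileOutputStream(target)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }
}
```

Inside onActivityResult this might be called as AudioSaver.saveStream(filestream, new File(getCacheDir(), "speech.amr")), where the file name is an arbitrary choice. The caller would still need to close filestream and handle the IOException.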
+15
Jun 25 '14 at 8:54

The last time I checked, Google Keep set these extras:

  • android.speech.extra.GET_AUDIO_FORMAT: audio/AMR
  • android.speech.extra.GET_AUDIO: true

They are not documented as part of the Android documentation, so they do not constitute an Android API. Also, Google Keep does not rely on the recognizer intent to honor these extras. It would be nice if such extras were popularized and documented by Google.

To find out which extras Google Keep sets when calling RecognizerIntent, implement an app that responds to RecognizerIntent and print out all the extras that it receives. You can also install Kõnele ( http://kaljurand.imtqy.com/K6nele/ ), which is an implementation of RecognizerIntent. When Kõnele is launched by Google Keep, long-press the wrench-shaped settings icon. This shows some technical details about the caller and also includes the incoming extras.
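Printing the incoming extras could look roughly like the sketch below. On Android the keys and values would come from the received intent's Bundle (for example via Bundle.keySet() and Bundle.get(key)); here a plain Map stands in for the Bundle so the formatting logic can run off-device. ExtrasDump and describe are hypothetical names used only for this example.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: formats extras as "key = value (Type)" lines, sorted by key.
final class ExtrasDump {
    static String describe(Map<String, Object> extras) {
        StringBuilder sb = new StringBuilder();
        // TreeMap gives a stable, alphabetical key order for easier reading in logs.
        for (Map.Entry<String, Object> e : new TreeMap<>(extras).entrySet()) {
            Object value = e.getValue();
            String type = (value == null) ? "null" : value.getClass().getSimpleName();
            sb.append(e.getKey()).append(" = ").append(value)
              .append(" (").append(type).append(")\n");
        }
        return sb.toString();
    }
}
```

In a real recognizer activity, the extras would first be copied from the Bundle into a Map and the resulting string written to logcat.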

@Iftah's answer explains how Google Keep returns audio to the RecognizerIntent caller.

+8
Apr 14 '14 at 21:34

I got this answer from another thread. I checked the dates and saw that it was posted a few days after your question, so I figured you missed it: "Speech recognition and audio recording in Android at the same time".

One guy says there:

I have a solution that works well for doing speech recognition and audio recording together. Here ( https://github.com/katchsvartanian/voiceRecognition ) is a link to a simple Android project that I created to show how the solution works. I also put some screenshots in the project to illustrate the application.

I will try to briefly explain the approach that I used. I combined two features in this project: the Google Speech API and FLAC recording.

The Google Speech API is called through HTTP connections. Mike Pultz gives more details about the API:

"(...) the new [Google] API is a full-duplex streaming API. This means that it actually uses two HTTP connections: one POST request to upload the content as a "live" chunked stream, and a second GET request to access the results, which makes much more sense for longer audio samples or for streaming audio."

However, this API needs to receive a FLAC sound file to work properly. That brings us to the second part: FLAC recording.

I implemented FLAC recording in this project by extracting and adapting some pieces of code and libraries from an open source application called AudioBoo. AudioBoo uses native code to record and play the FLAC format.

Thus, you can record FLAC audio, send it to the Google Speech API, receive the text, and play back the recorded audio.

The project that I created contains only the basics needed to make it work and can be improved for specific situations. To make it work in another scenario, you need to get a Google Speech API key, which is obtained by being part of the Google chromium-dev group. I left one key in this project to show that it works, but I will remove it eventually. If someone needs more information about this, let me know, because I cannot post more than two links in this post.

+3
May 01 '14 at 12:58


