Android Bluemix does not return speaker labels

I am using IBM Bluemix to transcribe some audio, and I want to use the speaker recognition (speaker labels) API.

I configured the recognizer as follows:

    private RecognizeOptions getRecognizeOptions() {
        return new RecognizeOptions.Builder()
                .continuous(true)
                .contentType(ContentType.OPUS.toString())
                //.model("en-US")
                .model("en-US_BroadbandModel")
                .timestamps(true)
                .smartFormatting(true)
                .interimResults(true)
                .speakerLabels(true)
                .build();
    }

But the returned JSON does not contain any speaker labels. How can I get the speaker labels returned as well when using the Bluemix Java SDK?

My Android recording code is as follows:

    private void recordMessage() {
        //mic.setEnabled(false);
        speechService = new SpeechToText();
        speechService.setUsernameAndPassword("usr", "pwd");

        if (listening != true) {
            capture = new MicrophoneInputStream(true);
            new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        speechService.recognizeUsingWebSocket(capture, getRecognizeOptions(), new MicrophoneRecognizeDelegate());
                    } catch (Exception e) {
                        showError(e);
                    }
                }
            }).start();
            Log.v("TAG", getRecognizeOptions().toString());
            listening = true;
            Toast.makeText(MainActivity.this, "Listening....Click to Stop", Toast.LENGTH_LONG).show();
        } else {
            try {
                capture.close();
                listening = false;
                Toast.makeText(MainActivity.this, "Stopped Listening....Click to Start", Toast.LENGTH_LONG).show();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
1 answer

Based on your example, I wrote a sample application and got speaker labels to work.

Make sure you are using Java SDK 4.2.1. In build.gradle, add:

 compile 'com.ibm.watson.developer_cloud:java-sdk:4.2.1' 

Here is a code snippet that recognizes a WAV file from the assets folder using WebSockets, interim results, and speaker labels. Note that it uses the en-US narrowband model, whereas the question used en-US_BroadbandModel.

    RecognizeOptions options = new RecognizeOptions.Builder()
            .contentType("audio/wav")
            .model(SpeechModel.EN_US_NARROWBANDMODEL.getName())
            .interimResults(true)
            .speakerLabels(true)
            .build();

    SpeechToText service = new SpeechToText();
    service.setUsernameAndPassword("SPEECH-TO-TEXT-USERNAME", "SPEECH-TO-TEXT-PASSWORD");

    InputStream audio = loadInputStreamFromAssetFile("speaker_label.wav");

    service.recognizeUsingWebSocket(audio, options, new BaseRecognizeCallback() {
        @Override
        public void onTranscription(SpeechResults speechResults) {
            Assert.assertNotNull(speechResults);
            System.out.println(speechResults.getResults().get(0).getAlternatives().get(0).getTranscript());
            System.out.println(speechResults.getSpeakerLabels());
        }
    });

where loadInputStreamFromAssetFile() is:

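    // Not static: getAssets() is an instance method of Context,
    // so this helper has to live in an Activity or another Context subclass.
    public InputStream loadInputStreamFromAssetFile(String fileName) {
        AssetManager assetManager = getAssets(); // From Context
        try {
            InputStream is = assetManager.open(fileName);
            return is;
        } catch (IOException e) {
            e.printStackTrace();
        }
        // Returned when the asset could not be opened
        return null;
    }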

Application Logs:

    I/System.out: so how are you doing these days
    I/System.out: so how are you doing these days things are going very well glad to hear
    I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned
    I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned before that there a company now that I'm
    I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned before that there a company now that I'm working with which is very much
    I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned before that there a company now that I'm working with which is very much just just myself and Chris now
    I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned before that there a company now that I'm working with which is very much just just myself and Chris now you had mentioned that %HESITATION okay
    I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned before that there a company now that I'm working with which is very much just just myself and Chris now you had mentioned that %HESITATION okay
    I/System.out: [{
    I/System.out:   "confidence": 0.487,
    I/System.out:   "final": false,
    I/System.out:   "from": 0.03,
    I/System.out:   "speaker": 0,
    I/System.out:   "to": 0.34
    I/System.out: }, {
    I/System.out:   "confidence": 0.487,
    I/System.out:   "final": false,
    I/System.out:   "from": 0.34,
    I/System.out:   "speaker": 0,
    I/System.out:   "to": 0.54
    I/System.out: }, {
    I/System.out:   "confidence": 0.487,
    I/System.out:   "final": false,
    I/System.out:   "from": 0.54,
    I/System.out:   "speaker": 0,
    I/System.out:   "to": 0.63
    I/System.out: }, {
    ...... blah blah blah
    I/System.out: }, {
    I/System.out:   "confidence": 0.343,
    I/System.out:   "final": false,
    I/System.out:   "from": 13.39,
    I/System.out:   "speaker": 1,
    I/System.out:   "to": 13.84
    I/System.out: }]
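Each entry in the speaker-label array above covers one short time span, so a typical next step is to merge consecutive entries from the same speaker into longer turns. Below is a minimal sketch of that idea in plain Java. The `SpeakerTurns` and `Segment` names are hypothetical stand-ins that only mirror the "from", "to" and "speaker" fields seen in the log; they are not part of the Watson SDK, and you would need to copy the values out of the SDK's own result objects yourself.

```java
import java.util.ArrayList;
import java.util.List;

public class SpeakerTurns {

    /** Hypothetical stand-in for one speaker-label entry (not an SDK class). */
    static final class Segment {
        final double from;   // start time in seconds
        final double to;     // end time in seconds
        final int speaker;   // speaker index as reported by the service

        Segment(double from, double to, int speaker) {
            this.from = from;
            this.to = to;
            this.speaker = speaker;
        }

        @Override
        public String toString() {
            return "speaker " + speaker + ": " + from + "s - " + to + "s";
        }
    }

    /** Merges consecutive segments with the same speaker into a single turn. */
    static List<Segment> mergeTurns(List<Segment> labels) {
        List<Segment> turns = new ArrayList<>();
        for (Segment label : labels) {
            int last = turns.size() - 1;
            if (last >= 0 && turns.get(last).speaker == label.speaker) {
                // Same speaker as the open turn: extend its end time.
                Segment open = turns.get(last);
                turns.set(last, new Segment(open.from, label.to, open.speaker));
            } else {
                // First segment or a speaker change: start a new turn.
                turns.add(label);
            }
        }
        return turns;
    }

    public static void main(String[] args) {
        // Values copied from the log output above (elided entries omitted).
        List<Segment> labels = new ArrayList<>();
        labels.add(new Segment(0.03, 0.34, 0));
        labels.add(new Segment(0.34, 0.54, 0));
        labels.add(new Segment(0.54, 0.63, 0));
        labels.add(new Segment(13.39, 13.84, 1));

        for (Segment turn : mergeTurns(labels)) {
            System.out.println(turn);
        }
    }
}
```

With the four entries shown, this yields two turns: one for speaker 0 from 0.03 s to 0.63 s and one for speaker 1 from 13.39 s to 13.84 s. Since the entries in the log are marked "final": false, the speaker assignments in interim results may still change, so it is probably safer to rebuild the turns each time a new result arrives.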

Source: https://habr.com/ru/post/1268540/

