How to read Ogg or MP3 audio files in a TensorFlow graph?

I saw image decoders like tf.image.decode_png in TensorFlow, but what about reading audio files (WAV, Ogg, MP3, etc.)? Is this possible without TFRecord?

E.g., something like this:

filename_queue = tf.train.string_input_producer(['my-audio.ogg'])
reader = tf.WholeFileReader()
key, value = reader.read(filename_queue)
my_audio = tf.audio.decode_ogg(value)
1 answer

Yes, there are dedicated decoders in the tensorflow.contrib.ffmpeg package. To use it, you must first install ffmpeg.

Example:

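import tensorflow as tf  # TensorFlow 1.x; tf.contrib was removed in 2.x

audio_binary = tf.read_file('song.mp3')  # raw file bytes as a string tensor
waveform = tf.contrib.ffmpeg.decode_audio(audio_binary, file_format='mp3', samples_per_second=44100, channel_count=2)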
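
Note that tf.contrib, and tf.contrib.ffmpeg with it, was removed in TensorFlow 2.x, so the snippet above only applies to 1.x. As a rough illustration of the same idea outside TensorFlow, here is a minimal sketch that calls the ffmpeg executable directly and parses its raw output. It assumes ffmpeg is on PATH. The helper names build_ffmpeg_command, pcm_to_frames and decode_audio are hypothetical, made up for this example. This is not a description of how the TensorFlow op is implemented.

```python
import shutil
import struct
import subprocess


def build_ffmpeg_command(path, samples_per_second=44100, channel_count=2):
    """Build an ffmpeg command that writes raw 32-bit float PCM to stdout."""
    return [
        "ffmpeg", "-v", "error",
        "-i", path,                       # input file; format is auto-detected
        "-f", "f32le",                    # raw little-endian float32 samples
        "-ar", str(samples_per_second),   # resample to the requested rate
        "-ac", str(channel_count),        # mix to the requested channel count
        "pipe:1",                         # write the result to stdout
    ]


def pcm_to_frames(raw, channel_count=2):
    """Turn interleaved float32 bytes into a list of per-frame tuples."""
    frame_size = 4 * channel_count
    usable = len(raw) - len(raw) % frame_size  # drop a trailing partial frame
    samples = struct.unpack("<%df" % (usable // 4), raw[:usable])
    return [tuple(samples[i:i + channel_count])
            for i in range(0, len(samples), channel_count)]


def decode_audio(path, samples_per_second=44100, channel_count=2):
    """Decode an audio file into frames; requires the ffmpeg executable."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg executable not found on PATH")
    cmd = build_ffmpeg_command(path, samples_per_second, channel_count)
    raw = subprocess.run(cmd, check=True, stdout=subprocess.PIPE).stdout
    return pcm_to_frames(raw, channel_count)
```

With these helpers, decode_audio('song.mp3') would return one tuple per frame, with one float per channel. That list can then be turned into an array or tensor of shape [frames, channels].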

Source: https://habr.com/ru/post/1663556/

