How to read Ogg or MP3 audio files in a TensorFlow graph?

Question

I have seen image decoders such as tf.image.decode_png in TensorFlow, but what about reading audio files (WAV, Ogg, MP3, etc.)? Is this possible without TFRecord?

For example, something like this:

filename_queue = tf.train.string_input_producer(['my-audio.ogg'])
reader = tf.WholeFileReader()
key, value = reader.read(filename_queue)
my_audio = tf.audio.decode_ogg(value)

tensorflow

Carl Thomé Dec 12 '16 at 21:11

1 answer

sygi · Accepted Answer · 2016-12-12T22:15:06+0000

Yes, there are dedicated decoders in the tensorflow.contrib.ffmpeg package. To use them, you must first install ffmpeg.
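Since the decoders depend on an installed ffmpeg, it can help to check for the binary before building the graph. A minimal sketch, assuming the executable is named ffmpeg and should be found on PATH; the helper name ffmpeg_available is hypothetical and not part of TensorFlow:

```python
import shutil


def ffmpeg_available(binary="ffmpeg"):
    """Return True if an executable named `binary` is found on PATH.

    Assumption: the decoders locate ffmpeg through PATH under this name.
    """
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if not ffmpeg_available():
        print("ffmpeg was not found on PATH; install it before decoding audio.")
```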

Example:

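import tensorflow as tf  # TensorFlow 1.x; tf.contrib is not available in 2.x
audio_binary = tf.read_file('song.mp3')
# channel_count=2 assumes a stereo file; adjust it to match your audio
waveform = tf.contrib.ffmpeg.decode_audio(audio_binary, file_format='mp3', samples_per_second=44100, channel_count=2)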
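
The same call should also cover the Ogg and WAV files mentioned in the question, with only the file_format argument changing. Below is a rough sketch, not a definitive implementation: the helper names infer_file_format and decode_audio_file are hypothetical, the list of formats is what I recall from the decode_audio documentation (mp3, mp4, ogg, wav), and channel_count=2 assumes stereo input. It needs TensorFlow 1.x with tf.contrib.

```python
import os

# Formats I recall being documented for decode_audio's file_format argument
# (an assumption; check the documentation for your TensorFlow version).
SUPPORTED_FORMATS = {"mp3", "mp4", "ogg", "wav"}


def infer_file_format(path):
    """Map a filename to a file_format string based on its extension."""
    ext = os.path.splitext(path)[1].lower().lstrip(".")
    if ext not in SUPPORTED_FORMATS:
        raise ValueError("unsupported audio extension: %r" % ext)
    return ext


def decode_audio_file(path, samples_per_second=44100, channel_count=2):
    """Build a graph op that decodes the audio file at `path`.

    Requires TensorFlow 1.x with tf.contrib and an installed ffmpeg.
    """
    # Imported lazily so infer_file_format works without TensorFlow.
    import tensorflow as tf

    audio_binary = tf.read_file(path)
    return tf.contrib.ffmpeg.decode_audio(
        audio_binary,
        file_format=infer_file_format(path),
        samples_per_second=samples_per_second,
        channel_count=channel_count,
    )
```

For the file in the question, decode_audio_file('my-audio.ogg') would pass file_format='ogg'. The result is a graph op, so it still has to be evaluated in a session to get the samples.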
