How do I read Ogg or MP3 audio files in a TensorFlow graph?

I have seen image decoders such as tf.image.decode_png in TensorFlow, but how do I read audio files (WAV, Ogg, MP3, etc.)? Is it possible without TFRecord?

E.g. something like this:

import tensorflow as tf

filename_queue = tf.train.string_input_producer(['my-audio.ogg'])
reader = tf.WholeFileReader()
key, value = reader.read(filename_queue)
my_audio = tf.audio.decode_ogg(value)  # hypothetical op; this is the decoder being asked for


2016-12-12 Carl Thomé

Yes, there are dedicated decoders in the package tensorflow.contrib.ffmpeg. To use it, you need to install ffmpeg first.

Example:

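import tensorflow as tf  # TensorFlow 1.x; tf.contrib is not available in 2.x
audio_binary = tf.read_file('song.mp3')
# channel_count=2 (stereo) is an assumption about the file; adjust as needed
waveform = tf.contrib.ffmpeg.decode_audio(audio_binary, file_format='mp3', samples_per_second=44100, channel_count=2)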
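
The ffmpeg requirement is there because, as far as I can tell, the contrib op hands the actual decoding to the ffmpeg binary. Below is a minimal sketch of the same idea done outside the graph with only the Python standard library. The function names (build_ffmpeg_command, pcm16_to_frames, decode_audio) are hypothetical helpers made up for illustration and are not part of TensorFlow. The sketch assumes ffmpeg is on PATH and that 16-bit PCM is an acceptable intermediate format.

```python
import array
import subprocess
import sys


def build_ffmpeg_command(path, samples_per_second=44100, channel_count=2):
    """Return an ffmpeg command that writes raw 16-bit little-endian PCM to stdout."""
    return [
        "ffmpeg", "-nostdin", "-loglevel", "error",
        "-i", path,
        "-f", "s16le",            # raw PCM output, no container
        "-acodec", "pcm_s16le",   # signed 16-bit little-endian samples
        "-ar", str(samples_per_second),
        "-ac", str(channel_count),
        "-",                      # write to stdout
    ]


def pcm16_to_frames(raw, channel_count=2):
    """Convert interleaved 16-bit PCM bytes to a list of frames of floats in [-1, 1)."""
    usable = len(raw) - len(raw) % (2 * channel_count)  # drop a trailing partial frame
    samples = array.array("h")
    samples.frombytes(raw[:usable])
    if sys.byteorder == "big":
        samples.byteswap()  # the PCM stream is little-endian
    scaled = [s / 32768.0 for s in samples]
    return [scaled[i:i + channel_count] for i in range(0, len(scaled), channel_count)]


def decode_audio(path, samples_per_second=44100, channel_count=2):
    """Decode any ffmpeg-readable file (MP3, Ogg, WAV, ...) into frames of floats."""
    cmd = build_ffmpeg_command(path, samples_per_second, channel_count)
    raw = subprocess.run(cmd, stdout=subprocess.PIPE, check=True).stdout
    return pcm16_to_frames(raw, channel_count)
```

Calling decode_audio('song.mp3') should give one [left, right] pair per sample, i.e. time along the first axis and channels along the second. The result could then be fed into a graph through a placeholder, at the cost of doing the decoding in Python rather than inside TensorFlow.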


2016-12-12 22:15:06 sygi

Nice! Thanks! –
