Create a thumbnail for an arbitrary audio file

I want to represent an audio file in an image with a maximum size of 180 × 180 pixels.

I want to generate this image so that it gives some idea of the audio file's content. Think of something like SoundCloud's waveform (an amplitude graph).

[Image: screenshot of SoundCloud's player]

Does anyone have something for this? I have searched a bit, mostly for “audio visualization” and “soundtrack”, but did not find anything useful.

I first posted this on ux.stackexchange.com; this is my attempt to reach any programmers who have worked on this.

3 answers

Take a look at this blog post about wav2png.py.


You can also break the sound into chunks and measure the RMS (a measure of loudness) of each one. Let's say you want an image that is 180 pixels wide.

I will use pydub, a lightweight wrapper that I wrote around the standard library's wave module:

    from pydub import AudioSegment

    # first I'll open the audio file
    sound = AudioSegment.from_mp3("some_song.mp3")

    # break the sound into 180 even chunks (or however
    # many pixels wide the image should be)
    chunk_length = len(sound) / 180

    loudness_of_chunks = []
    for i in range(180):
        start = i * chunk_length
        end = start + chunk_length

        chunk = sound[start:end]
        loudness_of_chunks.append(chunk.rms)

The for loop can also be written as the following list comprehension; I just wanted it to be clear:

    loudness_of_chunks = [
        sound[i * chunk_length:(i + 1) * chunk_length].rms
        for i in range(180)]

Now the only thing left is to scale the RMS values to a range of 0 - 180 (since you want the image to be 180 pixels tall):

    # convert to float so the division below is not integer division on Python 2
    max_rms = float(max(loudness_of_chunks))

    scaled_loudness = [(loudness / max_rms) * 180
                       for loudness in loudness_of_chunks]
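To make the chunking and scaling steps concrete without needing an audio file or pydub, here is a minimal sketch that works on a plain list of sample values. The helper names `chunk_rms` and `scale_to_height` are hypothetical (they are not part of pydub), the sample data is synthetic, and the code assumes Python 3. It also guards against silence, where the maximum RMS would be zero.

```python
import math


def chunk_rms(samples, n_chunks):
    """Split samples into n_chunks roughly equal pieces and return each piece's RMS."""
    chunk_length = len(samples) / n_chunks
    result = []
    for i in range(n_chunks):
        start = int(i * chunk_length)
        end = int((i + 1) * chunk_length)
        chunk = samples[start:end]
        if not chunk:
            result.append(0.0)  # more chunks than samples
            continue
        mean_square = sum(s * s for s in chunk) / len(chunk)
        result.append(math.sqrt(mean_square))
    return result


def scale_to_height(values, height):
    """Scale values linearly so that the largest one maps to height."""
    peak = max(values) if values else 0
    if peak == 0:
        return [0 for _ in values]  # silence: avoid dividing by zero
    return [int(round(v / peak * height)) for v in values]


# Synthetic example: a quiet half followed by a loud half.
samples = [100, -100] * 50 + [1000, -1000] * 50
bars = scale_to_height(chunk_rms(samples, 4), 180)
print(bars)  # [18, 18, 180, 180]
```

The quiet half is a tenth as loud as the loud half, so its bars come out at a tenth of the full height.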

I'll leave drawing the actual pixels to you; I'm not very experienced with PIL or ImageMagick :/
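For the drawing step, one option that needs no imaging library at all is to build the pixel grid by hand and write it out as a binary PGM greyscale file, which tools such as ImageMagick can then convert to PNG. This is only a sketch under that assumption: `render_bars` and `write_pgm` are hypothetical helper names, and the grey values (255 background, 180 bars) are arbitrary choices.

```python
def render_bars(bar_heights, height, background=255, foreground=180):
    """Return pixel rows (top row first) with one bottom-aligned bar per column."""
    rows = []
    for y in range(height):
        # Distance of this row from the bottom edge, counting the bottom row as 1.
        level = height - y
        rows.append([foreground if level <= h else background
                     for h in bar_heights])
    return rows


def write_pgm(path, rows):
    """Write pixel rows (values 0-255) as a binary PGM (P5) greyscale image."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    with open(path, "wb") as f:
        f.write("P5\n{} {}\n255\n".format(width, height).encode("ascii"))
        for row in rows:
            f.write(bytes(bytearray(row)))
```

With a list of 180 integer bar heights in the range 0 - 180 (round the scaled loudness values first), `write_pgm("thumbnail.pgm", render_bars(heights, 180))` would produce a 180 × 180 image.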


Based on Jiaaro's answer (thanks for writing pydub!), here are my two cents, built for web2py:

    import io
    import os

    import pydub
    from PIL import Image, ImageDraw, ImageFilter

    # `request` and `response` are globals that web2py provides to controllers

    def generate_waveform():
        img_width = 1170
        img_height = 140
        line_color = 180
        filename = os.path.join(request.folder, 'static', 'sounds', 'adg3.mp3')

        # first I'll open the audio file
        sound = pydub.AudioSegment.from_mp3(filename)

        # break the sound into img_width even chunks (one per pixel column)
        chunk_length = len(sound) / img_width

        loudness_of_chunks = [
            sound[i * chunk_length:(i + 1) * chunk_length].rms
            for i in range(img_width)]
        max_rms = float(max(loudness_of_chunks))

        scaled_loudness = [round(loudness * img_height / max_rms)
                           for loudness in loudness_of_chunks]

        # now convert the scaled_loudness to an image
        im = Image.new('L', (img_width, img_height), color=255)
        draw = ImageDraw.Draw(im)
        for x, rms in enumerate(scaled_loudness):
            y0 = img_height - rms
            y1 = img_height
            draw.line((x, y0, x, y1), fill=line_color, width=1)
        del draw

        im = im.filter(ImageFilter.SMOOTH).filter(ImageFilter.DETAIL)

        # PNG data is binary, so use a bytes buffer (works on Python 2 and 3)
        buffer = io.BytesIO()
        im.save(buffer, 'PNG')
        buffer.seek(0)
        return response.stream(buffer, filename=filename + '.png')

Source: https://habr.com/ru/post/907980/

