How to get the envelope of a sound using Python?

Hi, I am new to Python, as well as to the analysis of audio signals. I'm trying to get the envelope of a bird song (zebra finch). It has very fast signal fluctuations, and I have tried different approaches. For example, I tried to read the signal and get its envelope with the following code, based on other examples that I found (I added comments to the code to make it easier to follow):

    # Import the libraries
    from pylab import *
    import numpy
    import scipy, pylab
    from scipy.io import wavfile
    import scipy.signal as signal

    # Read the wave file (it is also saved as a text file further down)
    w = wavfile.read("mike_1_44100_.wav")  # here your sound file
    a = w[1]

    # I print to check
    print('vector w')
    print(w[0], w[1])
    print(w)

    i = w[1].size
    p = numpy.arange(i) * 0.0000226  # time axis built from the sample period (about 1/44100 s)
    print('vector p:')
    print(p)

    x = numpy.dstack([p, a])
    print('vector x:')
    print(x[0])

    # Saving file (savetxt writes the header line itself)
    numpy.savetxt('mike_1_44100_.txt', x[0], header='time z')
    print('i:')
    print(i)

    # num is the number of samples in the resampled signal (must be an integer)
    num = int(numpy.ceil(float(i * 0.0000226) / 0.0015))
    print(num)

    y_resample, x_resample = signal.resample(numpy.abs(a), num, p, axis=0, window=('gaussian', 150))
    #y_resample, x_resample = signal.resample(numpy.abs(a), num, p, axis=-1, window=0)

    # Applying a filter
    W1 = float(5000) / (float(44100) / 2)  # cutoff frequency relative to the Nyquist frequency
    (b, a1) = signal.butter(4, W1, btype='lowpass')
    aaa = a
    slp = 1 * signal.filtfilt(b, a1, aaa)

    # Hilbert transform of the filtered signal; the envelope is the
    # magnitude of the analytic signal, sqrt(real**2 + imag**2)
    y_resample2 = numpy.abs(signal.hilbert(slp, axis=-1))

    print('x sampled')
    #print(x_resample)
    print('y sampled')
    #print(y_resample)

    xx = x_resample  #[0]
    yy = y_resample  #[1]

    # Plotting with some style
    plot(p, a, label='Time Signal')  # to plot amplitude vs time
    #plot(p, numpy.abs(a), label='Time signal')
    plot(xx, yy, label='Resampled time signal Fourier technique Gauss window 1.5 ms ', linewidth=3)
    #plot(ww, label='Window', linewidth=3)
    #plot(p, y_resample2, label='Hilbert transformed time signal', linewidth=3)
    grid(True)
    pylab.xlabel("time [s]")
    pylab.ylabel("Amplitude")
    legend()
    show()

Here I tried two things. The first uses the resample function from scipy to get the envelope, but I have a problem with the signal amplitude that I still do not understand (I tried to upload the image obtained with the Fourier method, but the system does not allow me to).

The second uses the Hilbert transform to obtain the envelope (again, the system does not allow me to upload the image obtained with the Hilbert transform). You can run my code and get both images, but I have also put them at this link: http://ceciliajarne.web.unq.edu.ar/?page_id=92&preview=true

Here, too, the envelope does not come out right. I tried to filter the signal, as I saw in some examples, but my signal is attenuated and I cannot get the envelope. Can someone help me with my code, or suggest a better way to get the envelope? You can use any bird song as an example (I can give you mine), but I need to see what happens with complex sounds rather than simple signals, because they behave very differently (with simple sounds both methods work fine).

I also tried to adapt the code that I found in: http://nipy.org/nitime/examples/mtm_baseband_power.html

But I can’t get the right parameters for my signal, and I don’t understand the modulation part. I have already asked the code developers and am waiting for an answer.

1 answer

Since with a bird song the “modulation frequency” is likely to be much lower than the “carrier frequency”, even with a rapidly changing amplitude, an approximation of the envelope can be obtained by taking the absolute value of your signal and then applying a moving average filter with a length of 20 ms.
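A minimal sketch of that idea, assuming NumPy is available. The function name `moving_average_envelope` and the synthetic test tone (a 3 kHz "whistle" modulated at 5 Hz) are made up for illustration and stand in for the real recording:

```python
import numpy as np

def moving_average_envelope(samples, sample_rate, window_seconds=0.020):
    """Rectify the signal, then average it over a sliding window."""
    window_length = max(1, int(round(window_seconds * sample_rate)))
    kernel = np.ones(window_length) / window_length
    # mode="same" keeps the output as long as the input and roughly centered
    return np.convolve(np.abs(samples), kernel, mode="same")

# Synthetic test signal (an assumption, not the asker's recording)
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
true_envelope = 0.5 * (1.0 + np.sin(2 * np.pi * 5 * t))
samples = true_envelope * np.sin(2 * np.pi * 3000 * t)

envelope = moving_average_envelope(samples, sample_rate)
```

For a sinusoidal carrier the result follows the shape of the true envelope but sits at about 2/π (roughly 0.64) of its height, because that is the mean of a rectified sine.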

But wouldn’t you also be interested in the frequency variations, in order to characterize the song adequately? In that case, a Fourier transform over a moving window will give you much more information, namely the approximate frequency content as a function of time. That is what we humans hear, and it helps us distinguish between species of birds.
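A Fourier transform over a moving window could be sketched as follows, using only NumPy. The function name `spectrogram`, the window and hop lengths, and the two-tone test "song" are illustrative assumptions, not values tuned for zebra finch recordings:

```python
import numpy as np

def spectrogram(samples, sample_rate, window_length=512, hop_length=128):
    """Magnitude spectrum of successive Hann-windowed frames."""
    window = np.hanning(window_length)
    starts = np.arange(0, len(samples) - window_length + 1, hop_length)
    frames = np.array([samples[s:s + window_length] * window for s in starts])
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))  # shape: (frames, frequency bins)
    frequencies = np.fft.rfftfreq(window_length, d=1.0 / sample_rate)
    times = (starts + window_length / 2) / sample_rate  # center time of each frame
    return frequencies, times, magnitudes

# Synthetic "song" (an assumption): 2 kHz for the first half second, then 6 kHz
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
song = np.where(t < 0.5, np.sin(2 * np.pi * 2000 * t), np.sin(2 * np.pi * 6000 * t))

frequencies, times, magnitudes = spectrogram(song, sample_rate)
dominant = frequencies[np.argmax(magnitudes, axis=1)]  # strongest frequency per frame
```

A longer window gives finer frequency resolution but coarser time resolution, so the window length is a trade-off worth experimenting with.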

I don’t have access to the link you sent me:

"You do not have permission to preview drafts."

If you do not want attenuation, you should neither use the Butterworth filter nor take the moving average; use peak detection instead.

Moving average: each output sample is the average of the absolute values of, e.g., the 50 preceding input samples. The output will be attenuated.

Peak detection: each output sample is the maximum of the absolute values of, e.g., the 50 preceding input samples. The output will not be attenuated. You can low-pass filter the result afterwards to get rid of the remaining staircase ripple.
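The difference between the two can be checked on a tone of known, constant amplitude. This is a rough sketch assuming NumPy; the sample rate, the 3 kHz tone and the 50-sample interval are illustrative values only:

```python
import numpy as np

sample_rate = 44100                      # Hz, illustrative value
t = np.arange(4410) / sample_rate        # 0.1 s of signal
tone = np.sin(2 * np.pi * 3000 * t)      # constant amplitude 1.0
rectified = np.abs(tone)

interval_length = 50
# Moving average over every full window of 50 samples
average = np.convolve(rectified, np.ones(interval_length) / interval_length, mode="valid")
# Peak detection over the same windows
peak = np.array([rectified[i:i + interval_length].max()
                 for i in range(len(rectified) - interval_length + 1)])

print("moving average:", float(average.mean()))
print("peak detection:", float(peak.mean()))
```

The moving average settles near 2/π of the true amplitude (the mean of a rectified sine), while the peak detector stays close to the true amplitude of 1.0.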

You are wondering why, for example, the Butterworth filter attenuates your signal. It hardly does if your cutoff frequency is high enough; the signal just SEEMS greatly attenuated. Your input signal is not the sum of the carrier (whistle) and the modulation (envelope), but their product. Filtering limits the frequency content. What remains are frequency components (terms), not factors. You see an attenuated modulation (envelope) because that frequency component really is present in your signal MUCH more weakly than the original envelope, since the envelope was not added to your carrier but multiplied by it. Since the sinusoidal carrier by which your envelope is multiplied is not always at its maximum value, the envelope is “attenuated” by the modulation process, not by your filtering analysis.
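The "product, not sum" point can be made concrete with a synthetic amplitude-modulated tone. In this sketch (NumPy only; all frequencies are illustrative assumptions), the spectrum contains the carrier and two sidebands, but no line at the modulation frequency itself:

```python
import numpy as np

sample_rate = 8000                        # Hz, illustrative value
t = np.arange(sample_rate) / sample_rate  # exactly one second, so FFT bins are 1 Hz apart
carrier_hz, modulation_hz = 1000, 10

envelope = 1.0 + 0.5 * np.cos(2 * np.pi * modulation_hz * t)
samples = envelope * np.cos(2 * np.pi * carrier_hz * t)

# Single-sided amplitude spectrum: bin k corresponds to k Hz
amplitude = 2.0 * np.abs(np.fft.rfft(samples)) / len(samples)

for hz in (modulation_hz, carrier_hz - modulation_hz, carrier_hz, carrier_hz + modulation_hz):
    print(hz, "Hz:", round(float(amplitude[hz]), 3))
```

The amplitude is essentially zero at 10 Hz, about 0.25 at 990 Hz and 1010 Hz, and about 1.0 at 1000 Hz. A filter applied to the raw signal therefore cannot pull out the envelope as a separate low-frequency term; the envelope only becomes a low-frequency component after a nonlinear step such as taking the absolute value.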

In short: if you want the (multiplicative) envelope directly, rather than the (additive) frequency component that results from modulation (multiplication) by the envelope, use the peak detection approach.

Here is the peak detection algorithm in Pythonish pseudo-code, to convey the idea.

    # Untested, but apart from typos this should work fine
    # No attention paid to speed, just to clarify the algorithm
    # Input signal and output signal are Python lists
    # List comprehensions will be a bit faster
    # Numpy will be a lot faster

    def getEnvelope (inputSignal):

        # Taking the absolute value
        absoluteSignal = []
        for sample in inputSignal:
            absoluteSignal.append (abs (sample))

        # Peak detection
        intervalLength = 50  # Experiment with this number, it depends on your sample frequency and highest "whistle" frequency
        outputSignal = []
        for baseIndex in range (intervalLength, len (absoluteSignal)):
            maximum = 0
            for lookbackIndex in range (intervalLength):
                maximum = max (absoluteSignal [baseIndex - lookbackIndex], maximum)
            outputSignal.append (maximum)

        return outputSignal
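As the comments above note, NumPy should be a lot faster. One possible vectorized variant is sketched below, assuming NumPy 1.20 or later for `sliding_window_view`. The names `get_envelope_numpy` and `smooth` and the synthetic test tone are made up for illustration; the moving average is used here only as a simple stand-in for the low-pass filter that removes the staircase:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def get_envelope_numpy(input_signal, interval_length=50):
    """Running maximum of the absolute value over the preceding samples."""
    absolute_signal = np.abs(np.asarray(input_signal, dtype=float))
    # Needs at least interval_length samples, otherwise NumPy raises ValueError
    windows = sliding_window_view(absolute_signal, interval_length)
    # Skip the first window so the output lines up with the loop version,
    # whose first output sample belongs to baseIndex == intervalLength
    return windows[1:].max(axis=1)

def smooth(values, length=50):
    """Moving average, used as a simple low-pass filter against the staircase."""
    # mode="same" pads with zeros, so the first and last few values dip
    return np.convolve(values, np.ones(length) / length, mode="same")

# Synthetic test signal (an assumption): 3 kHz tone with a 5 Hz envelope
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
true_envelope = 0.5 * (1.0 + np.sin(2 * np.pi * 5 * t))
samples = true_envelope * np.sin(2 * np.pi * 3000 * t)

envelope = get_envelope_numpy(samples)
smoothed = smooth(envelope)
```

The interval has to span at least half a period of the lowest "whistle" frequency, otherwise the running maximum drops between carrier peaks.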

Source: https://habr.com/ru/post/1237967/

