Python: Finding Sound Frequency and Amplitude Over Time

Here is what I would like to do. I would like to find the sound frequency and amplitude of a .wav file at every, say, 1 ms of that file and save the results to a file. I have plotted frequency versus amplitude and amplitude over time, but I cannot work out how to get frequency over time. My ultimate goal is to read the file and use the amplitude to adjust variables and the frequency to trigger them, which seems to be the easy part. I have been using numpy, audiolab, matplotlib, etc., and FFTs, but I just can't figure this one out. Any help is appreciated! Thanks!

1 answer

Use an STFT (short-time Fourier transform) with overlapping windows to estimate the spectrogram. To save yourself the trouble of rolling your own, you can use the specgram function in Matplotlib's mlab module. It is important to use a window short enough that the sound is approximately stationary within it, and the buffer size should be a power of 2 so that the common radix-2 FFT can be used efficiently. 512 samples should suffice (about 10.67 ms at 48 kHz, or 93.75 Hz per bin). At a sampling rate of 48 kHz, overlap the windows by 464 samples to evaluate a sliding window every 1 ms (i.e., a shift of 48 samples).
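
The arithmetic behind those figures can be checked with a few lines. This is only a sketch: the helper name stft_params and its dictionary keys are made up for illustration and are not part of Matplotlib or NumPy.

```python
def stft_params(fs, nfft, hop_ms=1.0):
    """Derive window, bin, and overlap figures from sample rate and FFT size."""
    hop = int(round(fs * hop_ms / 1000.0))      # samples between successive frames
    return {
        "window_ms": 1000.0 * nfft / fs,        # duration of one analysis window
        "bin_hz": fs / nfft,                    # spacing between FFT bins
        "hop": hop,
        "noverlap": nfft - hop,                 # samples shared by adjacent frames
        "is_pow2": (nfft & (nfft - 1)) == 0,    # radix-2 FFTs want a power of two
    }

params = stft_params(48000, 512)
for name, value in params.items():
    print(name, value)
```

At 44.1 kHz the same call gives a hop of 44 samples, so the frames land roughly, not exactly, every 1 ms.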

Edit:

Here is an example that applies mlab.specgram to an 8-second signal with one tone per second, stepping from 2 kHz to 16 kHz. Note the response at the transients. I zoomed in on the image at 4 seconds to show the response in more detail. The frequency shifts at exactly 4 seconds, but it takes one buffer length (512 samples, approximately +/- 5 ms) for the transient to complete. This illustrates the kind of spectral/temporal smearing caused by non-stationary transitions as they pass through the buffer. You can also see that even for a stationary signal there is spectral leakage caused by windowing the data. The Hamming window function was used to minimize the side lobes of the leakage, but it also widens the main lobe.
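
The side-lobe versus main-lobe trade-off can be seen numerically on a single 512-sample frame. The following is a rough sketch using only NumPy; the helper names (spectrum_db, leakage_db, mainlobe_bins), the 5-bin guard, and the -10 dB threshold are arbitrary choices for illustration.

```python
import numpy as np

fs, n = 48000, 512
bin_hz = fs / n
t = np.arange(n) / fs

def spectrum_db(freq_hz, window):
    """Magnitude spectrum of one windowed cosine frame, in dB relative to its peak."""
    mag = np.abs(np.fft.rfft(np.cos(2 * np.pi * freq_hz * t) * window))
    return 20 * np.log10(np.maximum(mag, 1e-12) / mag.max())

def leakage_db(spec, guard=5):
    """Strongest bin lying more than `guard` bins away from the peak bin."""
    k = int(np.argmax(spec))
    far = np.abs(np.arange(len(spec)) - k) > guard
    return spec[far].max()

def mainlobe_bins(spec, floor_db=-10.0):
    """Number of bins within `floor_db` of the peak."""
    return int(np.sum(spec >= floor_db))

rect, hamm = np.ones(n), np.hamming(n)
off_bin = 32.5 * bin_hz   # tone halfway between two bins: worst case for leakage
on_bin = 32 * bin_hz      # tone exactly on a bin: isolates the main-lobe width

leak_rect = leakage_db(spectrum_db(off_bin, rect))
leak_hamm = leakage_db(spectrum_db(off_bin, hamm))
width_rect = mainlobe_bins(spectrum_db(on_bin, rect))
width_hamm = mainlobe_bins(spectrum_db(on_bin, hamm))

print("far leakage (dB): rect %.1f, hamming %.1f" % (leak_rect, leak_hamm))
print("bins within -10 dB of peak: rect %d, hamming %d" % (width_rect, width_hamm))
```

The Hamming window pushes the far leakage well below that of the rectangular window, at the cost of an on-bin tone spreading over more bins.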

spectrogram

    import numpy as np
    from matplotlib import mlab, pyplot

    # Python 2.x:
    # from __future__ import division

    Fs = 48000  # sampling rate (Hz)
    N = 512     # window / FFT length (samples)
    f = np.arange(1, 9) * 2000  # one tone per second: 2 kHz .. 16 kHz
    t = np.arange(8 * Fs) / Fs
    x = np.empty(t.shape)
    for i in range(8):
        x[i*Fs:(i+1)*Fs] = np.cos(2*np.pi * f[i] * t[i*Fs:(i+1)*Fs])

    w = np.hamming(N)
    ov = N - Fs // 1000  # e.g. 512 - 48000 // 1000 == 464
    Pxx, freqs, bins = mlab.specgram(x, NFFT=N, Fs=Fs, window=w,
                                     noverlap=ov)

    # plot the spectrogram in dB (10*log10 because Pxx is a power quantity)
    Pxx_dB = 10 * np.log10(Pxx)
    pyplot.subplots_adjust(hspace=0.4)

    pyplot.subplot(211)
    ex1 = bins[0], bins[-1], freqs[0], freqs[-1]
    pyplot.imshow(np.flipud(Pxx_dB), extent=ex1)
    pyplot.axis('auto')
    pyplot.axis(ex1)
    pyplot.xlabel('time (s)')
    pyplot.ylabel('freq (Hz)')

    # zoom in at t=4s to show transient
    pyplot.subplot(212)
    n1, n2 = int(3.991/8*len(bins)), int(4.009/8*len(bins))
    ex2 = bins[n1], bins[n2], freqs[0], freqs[-1]
    pyplot.imshow(np.flipud(Pxx_dB[:,n1:n2]), extent=ex2)
    pyplot.axis('auto')
    pyplot.axis(ex2)
    pyplot.xlabel('time (s)')
    pyplot.ylabel('freq (Hz)')

    pyplot.show()
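
To get from a spectrogram to the "frequency and amplitude every 1 ms, saved to a file" that the question asks for, one possible approach is to take the strongest bin and the RMS level of each frame. The sketch below uses only NumPy and a synthetic 3 kHz tone in place of a real recording. The names track_peaks and save_track, the CSV column names, and the output file name are assumptions made for illustration; reducing each frame to its single strongest bin is also an assumption, and it only suits signals with one dominant tone.

```python
import os
import tempfile
import numpy as np

def track_peaks(x, fs, nfft=512, hop=None):
    """Return (times, peak_freqs, rms), one entry per hop (default 1 ms)."""
    if hop is None:
        hop = fs // 1000
    w = np.hamming(nfft)
    n_frames = 1 + (len(x) - nfft) // hop
    # index matrix: row i selects samples [i*hop, i*hop + nfft)
    idx = np.arange(nfft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx]
    spec = np.abs(np.fft.rfft(frames * w, axis=1))
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    peak_freqs = freqs[np.argmax(spec, axis=1)]      # strongest bin per frame
    rms = np.sqrt(np.mean(frames ** 2, axis=1))      # amplitude per frame
    times = (hop * np.arange(n_frames) + nfft / 2) / fs  # frame centres (s)
    return times, peak_freqs, rms

def save_track(path, times, peak_freqs, rms):
    """Write one CSV row per frame: time, peak frequency, RMS amplitude."""
    rows = np.column_stack([times, peak_freqs, rms])
    np.savetxt(path, rows, delimiter=",",
               header="time_s,peak_hz,rms", comments="")

# synthetic stand-in for the decoded .wav samples: 1 s of a 3 kHz tone
fs = 48000
t = np.arange(fs) / fs
x = 0.5 * np.cos(2 * np.pi * 3000 * t)

times, peak_freqs, rms = track_peaks(x, fs)
out_path = os.path.join(tempfile.gettempdir(), "peaks.csv")  # hypothetical name
save_track(out_path, times, peak_freqs, rms)
print(len(times), "frames written to", out_path)
```

For a real file, the samples and rate could come from scipy.io.wavfile.read instead of the synthetic tone. Note that the peak frequency is quantized to the 93.75 Hz bin spacing, and each value still describes a full 512-sample window even though the frames are 1 ms apart.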

Source: https://habr.com/ru/post/894471/

