Reversible STFT and ISTFT in Python

Is there any universal form of short-term Fourier transform with the corresponding inverse transform built into SciPy or NumPy or something else?

Here's the pyplot specgram function in matplotlib that calls ax.specgram() , which calls mlab.specgram() , which calls _spectral_helper() :

 #The checks for if y is x are so that we can use the same function to #implement the core of psd(), csd(), and spectrogram() without doing #extra calculations. We return the unaveraged Pxy, freqs, and t. 

but

This is a helper function that implements a commonality between 204 #psd, csd and spectrogram. it is NOT intended to be used outside mlab

I am not sure if this can be used for STFT and ISTFT. Is there anything else, or should I translate something like these MATLAB functions ?

I know how to write my own special implementation; I'm just looking for something fully functional that can handle various window functions (but has a normal default value), is completely reversible with COLA windows ( istft(stft(x))==x ) checked by several people, without errors, handles well ends and zero padding, fast implementation of RFFT for real input, etc.

+46
python scipy fft signal-processing
Mar 17 '10 at 1:12
source share
9 answers

I'm a little late for this, but the scipy implemented has a built-in istft with 0.19.0

+2
May 08 '17 at 14:48
source share

Here is my Python code simplified for this answer:

 import scipy, pylab def stft(x, fs, framesz, hop): framesamp = int(framesz*fs) hopsamp = int(hop*fs) w = scipy.hanning(framesamp) X = scipy.array([scipy.fft(w*x[i:i+framesamp]) for i in range(0, len(x)-framesamp, hopsamp)]) return X def istft(X, fs, T, hop): x = scipy.zeros(T*fs) framesamp = X.shape[1] hopsamp = int(hop*fs) for n,i in enumerate(range(0, len(x)-framesamp, hopsamp)): x[i:i+framesamp] += scipy.real(scipy.ifft(X[n])) return x 

Notes:

  • List comprehension is a little trick that I like to use to simulate block signal processing in numpy / scipy. This is similar to blkproc in matlab. Instead of a for loop, I apply a command (e.g. fft ) to each frame of the signal inside the list comprehension, and then scipy.array passes it to a 2D array. I use this to create spectrograms, chromograms, MFCC grams, and more.
  • In this example, I use the naive overlap and add method in istft . To restore the original signal, the sum of the successive window functions should be constant, preferably equal to unity (1.0). In this case, I selected a Hann (or hanning ) window and an overlap of 50%, which works fine. See discussion for details.
  • There are probably more fundamental ways to calculate ISTFT. This example is mainly for learning.

Test:

 if __name__ == '__main__': f0 = 440 # Compute the STFT of a 440 Hz sinusoid fs = 8000 # sampled at 8 kHz T = 5 # lasting 5 seconds framesz = 0.050 # with a frame size of 50 milliseconds hop = 0.025 # and hop size of 25 milliseconds. # Create test signal and STFT. t = scipy.linspace(0, T, T*fs, endpoint=False) x = scipy.sin(2*scipy.pi*f0*t) X = stft(x, fs, framesz, hop) # Plot the magnitude spectrogram. pylab.figure() pylab.imshow(scipy.absolute(XT), origin='lower', aspect='auto', interpolation='nearest') pylab.xlabel('Time') pylab.ylabel('Frequency') pylab.show() # Compute the ISTFT. xhat = istft(X, fs, T, hop) # Plot the input and output signals over 0.1 seconds. T1 = int(0.1*fs) pylab.figure() pylab.plot(t[:T1], x[:T1], t[:T1], xhat[:T1]) pylab.xlabel('Time (seconds)') pylab.figure() pylab.plot(t[-T1:], x[-T1:], t[-T1:], xhat[-T1:]) pylab.xlabel('Time (seconds)') 

STFT of 440 Hz sinusoidISTFT of beginning of 440 Hz sinusoidISTFT of end of 440 Hz sinusoid

+59
Jul 31 '11 at 19:30
source share

Here is the STFT code I'm using. STFT + ISTFT gives a perfect reconstruction (even for the first frames). I slightly modified the code given here by Steve Thoa: here the magnitude of the recovered signal is the same as the input signal.

 import scipy, numpy as np def stft(x, fftsize=1024, overlap=4): hop = fftsize / overlap w = scipy.hanning(fftsize+1)[:-1] # better reconstruction with this trick +1)[:-1] return np.array([np.fft.rfft(w*x[i:i+fftsize]) for i in range(0, len(x)-fftsize, hop)]) def istft(X, overlap=4): fftsize=(X.shape[1]-1)*2 hop = fftsize / overlap w = scipy.hanning(fftsize+1)[:-1] x = scipy.zeros(X.shape[0]*hop) wsum = scipy.zeros(X.shape[0]*hop) for n,i in enumerate(range(0, len(x)-fftsize, hop)): x[i:i+fftsize] += scipy.real(np.fft.irfft(X[n])) * w # overlap-add wsum[i:i+fftsize] += w ** 2. pos = wsum != 0 x[pos] /= wsum[pos] return x 
+9
Dec 05 '13 at 19:38
source share

librosa.core.stft and istft look pretty similar to what I was looking for, although they weren't there at the time:

librosa.core.stft(y, n_fft=2048, hop_length=None, win_length=None, window=None, center=True, dtype=<type 'numpy.complex64'>)

They do not invert exactly; the ends narrow.

+3
Jan 20 '15 at 2:32
source share

Found another STFT, but does not have the corresponding inverse function:

http://code.google.com/p/pytfd/source/browse/trunk/pytfd/stft.py

 def stft(x, w, L=None): ... return X_stft 
  • w is a window function in the form of an array
  • L is the overlap in the samples
+1
Mar 18 '10 at 13:54
source share

None of the above answers worked well for OOTB for me. So I cheated on Steve Thoa.

 import scipy, pylab import numpy as np def stft(x, fs, framesz, hop): """ x - signal fs - sample rate framesz - frame size hop - hop size (frame size = overlap + hop size) """ framesamp = int(framesz*fs) hopsamp = int(hop*fs) w = scipy.hamming(framesamp) X = scipy.array([scipy.fft(w*x[i:i+framesamp]) for i in range(0, len(x)-framesamp, hopsamp)]) return X def istft(X, fs, T, hop): """ T - signal length """ length = T*fs x = scipy.zeros(T*fs) framesamp = X.shape[1] hopsamp = int(hop*fs) for n,i in enumerate(range(0, len(x)-framesamp, hopsamp)): x[i:i+framesamp] += scipy.real(scipy.ifft(X[n])) # calculate the inverse envelope to scale results at the ends. env = scipy.zeros(T*fs) w = scipy.hamming(framesamp) for i in range(0, len(x)-framesamp, hopsamp): env[i:i+framesamp] += w env[-(length%hopsamp):] += w[-(length%hopsamp):] env = np.maximum(env, .01) return x/env # right side is still a little messed up... 
+1
May 13, '14 at 21:50
source share

I also found this on GitHub, but it seems to work on pipelines instead of regular arrays:

http://github.com/ronw/frontend/blob/master/basic.py#LID281

 def STFT(nfft, nwin=None, nhop=None, winfun=np.hanning): ... return dataprocessor.Pipeline(Framer(nwin, nhop), Window(winfun), RFFT(nfft)) def ISTFT(nfft, nwin=None, nhop=None, winfun=np.hanning): ... return dataprocessor.Pipeline(IRFFT(nfft), Window(winfun), OverlapAdd(nwin, nhop)) 
0
Mar 17 '10 at 16:13
source share

I think scipy.signal has what you are looking for. It has reasonable defaults, supports multiple window types, etc.

http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.signal.spectrogram.html

 from scipy.signal import spectrogram freq, time, Spec = spectrogram(signal) 
0
Apr 6 '16 at 18:55
source share

If you have access to the C binary library that does what you want, use http://code.google.com/p/ctypesgen/ to create a Python interface for this library.

-3
Aug 02 2018-11-11T00:
source share



All Articles