I was looking for the same thing and had written a low-quality answer here that got deleted. I had some ideas, but I hadn't written them up properly. The deletion left me with that bruised-internet-ego kind of pride, so I decided to actually try the problem, and I think it worked!
Honestly, finding a true answer to Adam Davis's question is very hard, but a human-style approximation (lock onto the first-arriving source and ignore echoes, or treat them as sources too) is not so bad, I think, although I'm not a signal processing specialist by any means.
I read this and this. They made me realize that the problem boils down to finding the time shift between the two signals (cross-correlation). From there you can calculate the angle using the speed of sound: a source at angle theta off-centre reaches one microphone d*sin(theta)/c seconds before the other, where d is the microphone spacing and c is the speed of sound. Note that you will get two solutions (front and back).
The key information I read was in this answer and others on the same page, which talk about how to do fast Fourier transforms in scipy to find the cross-correlation curve.
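As a quick sanity check (my own illustration, not from that answer), you can verify that convolving one channel with the time-reversed other really is a cross-correlation, and read the delay off the peak:

```python
import numpy as np
from scipy.signal import fftconvolve

left = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0])
right = np.roll(left, 2)   # same pulse, arriving 2 frames later on the right

# convolving left with the time-reversed right is a cross-correlation
cor = fftconvolve(left, right[::-1])
assert np.allclose(cor, np.correlate(left, right, mode='full'))

# subtract the zero-lag index to read off the delay in frames
delay = np.argmax(cor) - (len(right) - 1)
print(delay)   # -2: the right channel lags the left by 2 frames
```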
Basically, you need to import the wave file into Python. See this.
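For completeness, here is one way to pull a stereo WAV into a pair of numpy arrays using the standard library's `wave` and `struct` modules (the same ones the code below imports); treat it as a sketch, since it assumes 16-bit stereo samples:

```python
import wave
import struct
from numpy import array

def readwav(fname):
    """Read a 16-bit stereo wave file into (left, right) numpy arrays."""
    wav = wave.open(fname, 'r')
    nchannels, sampwidth, framerate, nframes, _, _ = wav.getparams()
    frames = wav.readframes(nframes)
    wav.close()
    # '<h' = little-endian signed 16-bit; assumes sampwidth == 2, stereo
    samples = struct.unpack('<%dh' % (nframes * nchannels), frames)
    left = array(samples[0::2])   # even-indexed samples: left channel
    right = array(samples[1::2])  # odd-indexed samples: right channel
    return left, right, framerate
```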
If your wave input is a tuple of two numpy arrays (left, right), each zero-padded to at least its own length (to kill the circular-convolution wrap-around, apparently), the code follows from Gustavo's answer. I think you have to accept that the FFT makes a time-invariance assumption, which means that if you want any kind of time-based tracking of the signals, you need to "bite off" small chunks of the data.
I've patched together the following code from the sources mentioned. It will plot the estimated time delay, in frames, between left and right (negative/positive). To convert to actual time, divide by the sample rate. If you want to know the angle, you also need to:
- Assume everything is on a plane (no height factor)
- Forget the difference between front and back sound (you can't tell them apart)
You would also want to use the distance between the two microphones to sanity-check the result and reject echoes: the longest possible direct-arrival delay is the one for a source at 90 degrees, d/c, so anything longer must be an echo. A sketch of the angle conversion under these assumptions is below.
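With those assumptions the geometry is just sin(theta) = c*dt/d, so a minimal sketch of the conversion (my own, using c ≈ 340 m/s) might look like:

```python
from math import asin, degrees

def delay_to_angle(delay_frames, framerate, mic_distance, c=340.0):
    """Convert a cross-correlation delay (in frames) to a bearing in degrees.

    Positive/negative angles are the two sides of centre; front and back
    are indistinguishable. Returns None for delays longer than the mic
    spacing allows, which can only come from echoes or noise.
    """
    t = delay_frames / float(framerate)   # delay in seconds
    sina = t * c / mic_distance           # sin(theta) = c * dt / d
    if abs(sina) > 1.0:                   # physically impossible: echo
        return None
    return degrees(asin(sina))
```

For example, with microphones 0.2 m apart and 44100 Hz audio, a 10-frame delay works out to about 23 degrees off-centre.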
I realize that I've borrowed a lot here, so thanks to all of those who inadvertently contributed!
```python
import wave
import struct
from numpy import array, concatenate, argmax
from numpy import abs as nabs
from scipy.signal import fftconvolve
from matplotlib.pyplot import plot, show
from math import log

def crossco(wav):
    """Returns cross correlation function of the left and right audio. It
    uses a convolution of left with the right reversed which is the
    equivalent of a cross-correlation.
    """
    cor = nabs(fftconvolve(wav[0], wav[1][::-1]))
    return cor

def trackTD(fname, width, chunksize=5000):
    track = []
```
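The body of `trackTD` is cut off above. A minimal sketch of how the chunked loop might continue, assuming 16-bit stereo input and reusing the imports and `crossco` from the block above (the exact chunk handling is my guess, not the original code), is:

```python
def trackTD(fname, width, chunksize=5000):
    track = []
    wav = wave.open(fname, 'r')
    nchannels, sampwidth, framerate, nframes, _, _ = wav.getparams()

    # step through the file one chunk at a time
    while wav.tell() < nframes - chunksize:
        frames = wav.readframes(chunksize)
        out = struct.unpack('<%dh' % (chunksize * nchannels), frames)
        left = array(out[0::2])    # even samples: left channel
        right = array(out[1::2])   # odd samples: right channel

        # zero-pad to twice the length so the FFT-based convolution
        # doesn't wrap around (no circular-convolution artifacts)
        left = concatenate((left, [0] * chunksize))
        right = concatenate((right, [0] * chunksize))

        # peak position minus the zero-lag index = delay in frames;
        # divide by framerate for seconds, or feed delay_to_angle()
        # from the sketch above together with `width` for a bearing
        delay = argmax(crossco((left, right))) - (2 * chunksize - 1)
        track.append(delay)

    # estimated left/right delay, in frames, for each chunk
    plot(track)
    show()
    return track
```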
I tried this out using some stereo sound I found on equilogy. I used the car example (stereo file). It produced this.
To do this on the fly, I think you'd need an incoming stereo source that you "listen" to for a short time (I used 1000 frames = 0.0208 s), then calculate and repeat.
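A rough sketch of that loop with PyAudio (my assumption; any stereo capture API would do), reading 1000-frame chunks and reusing `crossco` from the code above:

```python
import struct
import pyaudio
from numpy import array, concatenate, argmax

CHUNK = 1000      # frames per update (~0.021 s at 48 kHz)
RATE = 48000

pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paInt16, channels=2, rate=RATE,
                 input=True, frames_per_buffer=CHUNK)

try:
    while True:
        data = stream.read(CHUNK)                # interleaved L/R int16
        out = struct.unpack('<%dh' % (CHUNK * 2), data)
        left = concatenate((array(out[0::2]), [0] * CHUNK))
        right = concatenate((array(out[1::2]), [0] * CHUNK))
        delay = argmax(crossco((left, right))) - (2 * CHUNK - 1)
        print(delay)                             # frames of L/R delay
except KeyboardInterrupt:
    stream.stop_stream()
    stream.close()
    pa.terminate()
```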
[edit: found that you can easily do the correlation with the FFT convolve function, using the time-reversed series of one of the two signals]