How to get microphone sound in Python and process it on the fly?

Hello,

I am trying to write a program in Python that prints a line every time the microphone picks up a tap. By tap, I mean a loud, sudden noise or something similar.

I searched SO and found this post: Having recognized the tone of sound

I think the PyAudio library will meet my needs, but I'm not sure how to make my program wait for an audio signal (real-time microphone monitoring), or how to process the signal once I have it (do I need to use the Fourier transform, as suggested in the post above?).

Thanks in advance for any help you could give me.

+45
python microphone
Dec 20 '09 at 20:01
2 answers

If you are on Linux, you can use pyalsaaudio. On Windows there is PyAudio, and there is also a library called SoundAnalyse.

I found an example for Linux here:

    #!/usr/bin/python
    ## This is an example of a simple sound capture script.
    ##
    ## The script opens an ALSA pcm for sound capture. Set
    ## various attributes of the capture, and reads in a loop,
    ## Then prints the volume.
    ##
    ## To test it out, run it and shout at your microphone:

    # Note: audioop was removed from the standard library in Python 3.13.
    import alsaaudio, time, audioop

    # Open the device in nonblocking capture mode. The last argument could
    # just as well have been zero for blocking mode. Then we could have
    # left out the sleep call in the bottom of the loop
    inp = alsaaudio.PCM(alsaaudio.PCM_CAPTURE, alsaaudio.PCM_NONBLOCK)

    # Set attributes: Mono, 8000 Hz, 16 bit little endian samples
    inp.setchannels(1)
    inp.setrate(8000)
    inp.setformat(alsaaudio.PCM_FORMAT_S16_LE)

    # The period size controls the internal number of frames per period.
    # The significance of this parameter is documented in the ALSA api.
    # For our purposes, it is sufficient to know that reads from the device
    # will return this many frames. Each frame being 2 bytes long.
    # This means that the reads below will return either 320 bytes of data
    # or 0 bytes of data. The latter is possible because we are in nonblocking
    # mode.
    inp.setperiodsize(160)

    while True:
        # Read data from device
        l, data = inp.read()
        if l:
            # Return the maximum of the absolute value of all samples in a fragment.
            print(audioop.max(data, 2))
        time.sleep(.001)
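For PyAudio, a roughly equivalent loop might look like the sketch below. It is an illustration, not taken from the linked example: the names `peak_amplitude` and `monitor` are made up here, the rate and chunk size simply mirror the ALSA script, and the peak is computed with `struct` instead of `audioop`. It assumes PyAudio is installed and a default input device exists.

```python
import struct


def peak_amplitude(data):
    """Largest absolute sample in a chunk of 16-bit little-endian mono PCM."""
    count = len(data) // 2
    if count == 0:
        return 0
    samples = struct.unpack("<%dh" % count, data[:count * 2])
    return max(abs(s) for s in samples)


def monitor(rate=8000, chunk=160):
    """Print the peak amplitude of each chunk read from the default microphone.

    Hypothetical helper: rate and chunk mirror the ALSA example above.
    """
    # Third-party package; imported here so peak_amplitude works without it.
    import pyaudio

    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=1, rate=rate,
                     input=True, frames_per_buffer=chunk)
    try:
        while True:
            # Blocking read: returns once `chunk` frames are available,
            # so no sleep call is needed in this loop.
            data = stream.read(chunk)
            print(peak_amplitude(data))
    except KeyboardInterrupt:
        pass
    finally:
        stream.stop_stream()
        stream.close()
        pa.terminate()
```

Calling `monitor()` starts the loop; Ctrl+C stops it and releases the device.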
+34
Dec 20 '09 at 21:10

... or how to process the signal once I have it (do I need to use the Fourier transform, as suggested in the post above)?

If you want to detect a tap, then I think you are interested in amplitude more than frequency, so Fourier transforms are probably not useful for your specific purpose. You probably want to keep a running measurement of the short-term (say, 10 ms) input amplitude and detect when it suddenly increases by a certain delta. You will need to tune these parameters:

  • what counts as a “short-term” amplitude measurement
  • how large a delta increase you are looking for
  • how quickly the delta change must occur
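The three parameters above can be sketched as follows. This is a minimal illustration, not a tuned detector: `rms`, `TapDetector`, and the default values are hypothetical. The chunk length you feed in is the "short-term" window (80 frames at 8000 Hz is 10 ms), `delta` is the required jump, and `window` is how many earlier chunks form the baseline, which limits how slowly the level may rise and still count.

```python
import math
import struct


def rms(chunk):
    """Root-mean-square amplitude of 16-bit little-endian mono PCM bytes."""
    count = len(chunk) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, chunk[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / float(count))


class TapDetector(object):
    """Flags a chunk whose level jumps above the recent baseline (sketch)."""

    def __init__(self, delta=3000.0, window=3):
        self.delta = delta    # required jump in RMS amplitude (assumed value)
        self.window = window  # number of earlier chunks in the baseline
        self.history = []

    def feed(self, chunk):
        """Return True if this chunk is `delta` louder than the baseline."""
        level = rms(chunk)
        is_tap = False
        # Only decide once a full baseline has been collected.
        if len(self.history) == self.window:
            baseline = sum(self.history) / float(self.window)
            is_tap = (level - baseline) >= self.delta
        # Slide the baseline window forward by one chunk.
        self.history.append(level)
        if len(self.history) > self.window:
            self.history.pop(0)
        return is_tap
```

In a capture loop, each chunk read from the microphone would be passed to `feed`, and the program would print its line whenever `feed` returns True.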

Although I said that you are not interested in frequency, you may want to filter the signal first to remove especially low-frequency and high-frequency components. This can help you avoid some “false positives”. You can do this with a FIR or IIR digital filter; Fourier is not required.
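One possible form of such a filter is sketched below: a one-pole high-pass followed by a one-pole low-pass, which together act as a crude IIR band-pass. The function name `band_pass` and the cutoff defaults are assumptions chosen for illustration, and a real application may need a steeper filter.

```python
import math


def band_pass(samples, rate, low_cut=200.0, high_cut=3000.0):
    """Crude IIR band-pass: one-pole high-pass, then one-pole low-pass.

    samples: sequence of numeric sample values
    rate: sampling rate in Hz
    low_cut, high_cut: approximate cutoff frequencies in Hz (assumed defaults)
    """
    dt = 1.0 / rate
    rc_hp = 1.0 / (2.0 * math.pi * low_cut)
    rc_lp = 1.0 / (2.0 * math.pi * high_cut)
    a = rc_hp / (rc_hp + dt)  # high-pass coefficient
    b = dt / (rc_lp + dt)     # low-pass smoothing factor

    out = []
    prev_x = 0.0
    hp = 0.0
    lp = 0.0
    for x in samples:
        # High-pass stage: removes DC offset and low-frequency rumble.
        hp = a * (hp + x - prev_x)
        prev_x = x
        # Low-pass stage: smooths out high-frequency hiss.
        lp += b * (hp - lp)
        out.append(lp)
    return out
```

The filtered samples can then be used for the amplitude measurement in place of the raw ones.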

+5
Dec 20 '09 at 23:42