Interpreting a .WAV File [Python]

I am trying to process an audio file in Python and apply a low-pass filter to remove some background noise. Currently, I can load the file and build an array from its data values:

    import struct
    import wave

    class AudioModule:
        def __init__(self, fname=""):
            self.stream = wave.open(fname, 'r')
            self.frames = []

        def build(self):
            self.stream.rewind()
            # Read one frame at a time and unpack it as an unsigned byte
            # (this file has sampwidth == 1).
            for x in range(self.stream.getnframes()):
                self.frames.append(struct.unpack('B', self.stream.readframes(1)))
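
A minimal usage sketch of this class (the file name is just a placeholder):

    audio = AudioModule("noisy.wav")   # placeholder file name
    audio.build()
    print(len(audio.frames))           # number of frames read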

I used struct.unpack('B', ...) for this particular file. The audio file I downloaded has the following specifications:

    nchannels: 1
    sampwidth: 1
    framerate: 6000

I know that sampwidth is the width in bytes of each sample returned by readframes(1). After loading, the array contains values like these (ranging from about 128 to 180):

    >>> r.frames[6000:6025]
    [(127,), (127,), (127,), (127,), (128,), (128,), (128,), (128,), (128,),
     (128,), (128,), (128,), (128,), (128,), (128,), (128,), (128,), (128,),
     (128,), (128,), (128,), (128,), (128,), (128,), (128,)]

Question: What are these numbers? Other audio files with a larger sample width give completely different numbers. My goal is to trim certain frequencies from the audio file; unfortunately, I know very little about this and don't know how these values relate to frequency.

What is the best way to remove everything above a certain frequency threshold?

In addition, the values are packed back into another file as follows:

    def store(self, fout=""):
        out = wave.open(fout, 'w')
        nchannels = self.stream.getnchannels()
        sampwidth = self.stream.getsampwidth()
        framerate = self.stream.getframerate()
        nframes = len(self.frames)
        comptype = "NONE"
        compname = "not compressed"
        out.setparams((nchannels, sampwidth, framerate,
                       nframes, comptype, compname))
        # Pack each stored tuple back into bytes, one frame at a time.
        if nchannels == 1:
            for f in self.frames:
                data = struct.pack('B', f[0])
                out.writeframes(data)
        elif nchannels == 2:
            for f in self.frames:
                data = struct.pack('BB', f[0], f[1])
                out.writeframes(data)
        out.close()
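
(As an aside, calling writeframes once per sample is slow; a sketch of the mono case packed and written in a single call, assuming the frames list built above:)

    # Sketch only: bytes() accepts an iterable of ints in 0..255,
    # which matches the unsigned 8-bit samples stored in self.frames.
    data = bytes(f[0] for f in self.frames)
    out.writeframes(data)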
1 answer

I think the numbers describe the displacement of the speaker membrane at each instant, i.e. the instantaneous amplitude (loudness) of the signal. A higher value means a larger vibration of the membrane. In an 8-bit WAV file the samples are stored as unsigned bytes, which is why silence sits around 128 rather than 0.
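
As an illustration (a sketch that assumes the 8-bit file and the r.frames list from the question), centring the samples around zero shows the waveform swinging above and below silence:

    # 8-bit WAV samples are unsigned (0..255) with 128 as the zero level,
    # so subtracting 128 gives a signed amplitude around silence.
    centered = [f[0] - 128 for f in r.frames]
    print(centered[6000:6010])   # mostly -1 and 0 in the quiet region shown above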

The sample width determines the range of amplitude values a single sample can take, and different files use different sample widths. For example, with a sample width of 1 bit you could only say whether there is sound or not; in general, the larger the sample width, the finer the amplitude resolution and the higher the quality. For more on sample widths, you can read Sampling Rate and Bitrate: Gut Digital Sound.
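
Concretely, for uncompressed WAV data the two common sample widths map to the following struct formats and value ranges (a reference sketch; only the 1- and 2-byte cases are shown):

    # WAV convention: 8-bit samples are unsigned, 16-bit samples are signed little-endian.
    SAMPLE_FORMATS = {
        1: ('B', 0, 255),          # sampwidth 1: unsigned byte, silence at 128
        2: ('<h', -32768, 32767),  # sampwidth 2: signed 16-bit, silence at 0
    }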

The samples stored in the audio file are in the time domain; they do not represent frequency directly. If you want values in the frequency domain, run an FFT on the resulting array.
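
A sketch of that idea as a crude low-pass filter (it assumes the 6000 Hz frame rate and the r.frames list from the question, and an example cutoff; a smoother result would come from a proper filter design, e.g. with scipy.signal, but this shows how the samples relate to frequency):

    import numpy as np

    framerate = 6000      # from the file's header
    cutoff_hz = 1000      # example threshold; choose one that suits your noise

    # Centre the 8-bit samples around zero before transforming.
    samples = np.array([f[0] for f in r.frames], dtype=np.float64) - 128.0

    spectrum = np.fft.rfft(samples)                       # time domain -> frequency domain
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / framerate)
    spectrum[freqs > cutoff_hz] = 0                       # zero every bin above the cutoff
    filtered = np.fft.irfft(spectrum, n=len(samples))     # back to the time domain

    # Convert back to unsigned bytes before repacking with struct/wave.
    out_samples = np.clip(np.round(filtered) + 128, 0, 255).astype(np.uint8)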

I recommend using numpy for the audio processing. For example, to get the array you need, np.fromstring (np.frombuffer in current NumPy) is enough, and related functions such as the FFT are already provided. Plenty of examples and documentation can be found with a quick search.
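
For instance, reading the whole file into a NumPy array in one go might look like this (a sketch; the file name is a placeholder and the dtype assumes the 8-bit file from the question):

    import numpy as np
    import wave

    with wave.open("noisy.wav", 'r') as stream:        # placeholder file name
        raw = stream.readframes(stream.getnframes())   # all frames as raw bytes
        samples = np.frombuffer(raw, dtype=np.uint8)   # sampwidth 1 -> unsigned bytes
        # For a 16-bit file you would use dtype=np.int16 instead.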


Source: https://habr.com/ru/post/1491653/

