FFT Frequency Shift

I am working on a frequency shifter using the basic FFT algorithm supplied by Rosetta Code. My understanding is that to shift a signal's frequency, you apply the FFT to the original sound, multiply the frequency of each resulting sinusoidal component by the user-defined shift factor, and then sum the sine waves back together. When I run my algorithm, the output is of extremely low quality, as if the algorithm did not have enough sine waves to reproduce the signal correctly. The algorithm is implemented in the class in the header file below and is called (correctly) elsewhere.

    #include <complex>
    #include <valarray>
    #include <cmath>   // for sqrt, sin (not in the original listing)

    static const double PI = 3.141592653589793;  // PI was used but never defined

    typedef std::complex<double> Complex;
    typedef std::valarray<Complex> CArray;

    class FrequencyShifter {
        float sampleRate;
    public:
        FrequencyShifter() { }

        void setSampleRate(float inSampleRate) { sampleRate = inSampleRate; }

        double abs(double in0) {
            if (in0 >= 0) return in0;
            else return -in0;
        }

        void fft(CArray& x) {
            const size_t N = x.size();
            if (N <= 1) return;

            // divide
            CArray even = x[std::slice(0, N/2, 2)];
            CArray odd  = x[std::slice(1, N/2, 2)];

            // conquer
            fft(even);
            fft(odd);

            // combine
            for (size_t k = 0; k < N/2; ++k) {
                Complex t = std::polar(1.0, -2 * PI * k / N) * odd[k];
                x[k]     = even[k] + t;
                x[k+N/2] = even[k] - t;
            }
        }

        double convertToReal(double im, double re) {
            return sqrt(abs(im*im - re*re));
        }

        void processBlock(float *inBlock, const int inFramesToProcess, float scale) {
            // inFramesToProcess is the number of samples in inBlock
            Complex *copy = new Complex[inFramesToProcess];
            for (int frame = 0; frame < inFramesToProcess; frame++) {
                copy[frame] = Complex((double)inBlock[frame], 0.0);
            }
            CArray data(copy, inFramesToProcess);
            fft(data);

            const float freqoffsets = sampleRate/inFramesToProcess;
            for (float x = 0; x < data.size()/2; x++) {
                for (float frame = 0; frame < inFramesToProcess; frame++) {
                    inBlock[(int)frame] = (float)(convertToReal(data[(int)x].imag(), data[(int)x].real())
                                                  * sin(freqoffsets*x*frame*scale));
                }
            }
        }
    };

I assume that part of the problem is that I only use sampleRate/inFramesToProcess frequencies to cover the sine waves. Will sending larger audio blocks (thus more samples in inBlock and a larger inFramesToProcess) make the sound less grainy? How can I improve the result without just changing the values or lengths of the arguments?

1 answer

Below is an updated version of processBlock with a number of changes needed to implement the frequency shift, which I describe below:

    void processBlock(float *inBlock, const int inFramesToProcess, float scale) {
        // inFramesToProcess is the number of samples in inBlock
        Complex *copy = new Complex[inFramesToProcess];
        for (int frame = 0; frame < inFramesToProcess; frame++) {
            copy[frame] = Complex((double)inBlock[frame], 0.0);
        }
        CArray data(copy, inFramesToProcess);
        fft(data);

        const float freqoffsets = 2.0*PI/inFramesToProcess;
        const float normfactor  = 2.0/inFramesToProcess;
        for (int frame = 0; frame < inFramesToProcess; frame++) {
            inBlock[frame] = 0.5*data[0].real();
            for (int x = 1; x < data.size()/2; x++) {
                float arg = freqoffsets*x*frame*scale;
                inBlock[frame] += data[x].real()*cos(arg) - data[x].imag()*sin(arg);
            }
            inBlock[frame] *= normfactor;
        }
        delete[] copy;  // free the temporary buffer
    }

Derivation

The spectrum you get from the FFT is complex-valued and can be viewed as a representation of your signal in terms of sine and cosine waves. The time-domain signal can be reconstructed using the inverse transform, given by the relation:

    x[n] = (1/N) * sum_{k=0..N-1} X[k] * exp(j*2*pi*k*n/N)

Using the symmetry of the frequency spectrum of a real signal (X[N-k] is the complex conjugate of X[k]), this can be expressed as:

    x[n] = (1/N) * ( X[0] + 2 * sum_{k=1..N/2-1} Re( X[k] * exp(j*2*pi*k*n/N) ) + X[N/2] * cos(pi*n) )

or equivalently:

    x[n] = X[0]/N + (2/N) * sum_{k=1..N/2-1} ( Re(X[k])*cos(2*pi*k*n/N) - Im(X[k])*sin(2*pi*k*n/N) ) + (X[N/2]/N) * (-1)^n

As you can see, the terms at indices 0 and N/2 are special cases with purely real coefficients in the frequency domain. For simplicity, assuming that the spectral content does not extend all the way up to N/2, you can drop the N/2 term and still get a reasonable approximation. Each of the other terms contributes a component that can be implemented as:

    normfactor = 2.0/inFramesToProcess;
    normfactor*(data[x].real()*cos(arg) - data[x].imag()*sin(arg))

Of course, you need to accumulate all these contributions into the final inBlock[frame] buffer rather than overwrite the previous results:

    inBlock[frame] += normfactor*(data[x].real()*cos(arg) - data[x].imag()*sin(arg));
    //             ^^

Note that the normalization can be applied to the final result after the loop to reduce the number of multiplications. In that case, special care must be taken with the DC term at index 0 (which has a coefficient of 1/N instead of 2/N):

    inBlock[frame] = 0.5*data[0].real();
    for (int x = 1; x < data.size()/2; x++) {
        float arg = freqoffsets*x*frame*scale;
        inBlock[frame] += data[x].real()*cos(arg) - data[x].imag()*sin(arg);
    }
    inBlock[frame] *= normfactor;

Finally, when generating the tones, the phase argument arg passed to sin and cos should be 2*pi*k*n/inFramesToProcess (before applying the scale factor), where n is the sample index in the time domain and k is the frequency-domain index. The net result is that the computed frequency step freqoffsets should really be 2.0*PI/inFramesToProcess.

Notes

  • The FFT algorithm works under the assumption that the underlying time-domain signal is periodic with a period equal to the length of your block. As a result, there may be audible discontinuities between blocks.
  • Future readers should note that this does not shift the spectrum by a constant amount; rather, it compresses or expands the spectrum, since the frequencies are scaled by a multiplicative factor. For example, a signal with components from 100 Hz to 200 Hz scaled by a factor of 0.75 ends up spanning 75 Hz to 150 Hz. Note how the lower limit moved down by 25 Hz while the upper limit moved down by 50 Hz.

Source: https://habr.com/ru/post/1263598/

