I've been looking into audio analysis. I have a WAV stream read into memory and I need to perform various functions on the data, such as an FFT. I've been reading the theory for this but I'm not sure I'm reading it correctly. When reading the stream, I get a bunch of numbers output, which I guess is the sampled data (at 44100 Hz). Do I perform all the functions on this very stream? So, for a window of 1024 samples, do I simply take the first 1024 numbers from my stream? Then do I perform an FFT and all the other functions on this set of 1024 and repeat for the rest of the stream?
I'm beginning to understand the theory of it, and the idea of summing the samples etc., but I'm not sure what this means in implementation terms.
Edit - To clarify the stream values I get, the numbers are along the lines of -0.432,-0.065...
This is just a brief overview of what you can do. For details I would suggest you look into some literature.
Before applying the FFT, the audio signal needs to be preprocessed, i.e. windowed. Say you pick a window function (Hann, for example): it is applied to the raw audio in frames that overlap each other, which takes care of edge effects. A window size of 1024 samples is a convenient choice. After windowing, you take the FFT of each 1024-sample (preprocessed) frame.
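To make the windowing step concrete, here is a minimal sketch in plain Python (standard library only). It assumes the samples are floats roughly in the range -1.0 to 1.0, as in the question, a frame size of 1024, and a hop of 512 samples (50% overlap). The hop value, the hand-written radix-2 FFT, and all function names are my own choices for illustration, not something fixed by the theory. In practice you would probably use an FFT routine from a numerical library instead.

```python
import cmath
import math


def hann(n):
    """Return an n-point Hann window (periodic form)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def frames(samples, size=1024, hop=512):
    """Yield overlapping frames; a trailing partial frame is dropped."""
    for start in range(0, len(samples) - size + 1, hop):
        yield samples[start:start + size]


def magnitude_spectra(samples, size=1024, hop=512):
    """Window each frame, FFT it, and keep the magnitudes of bins 0..size/2."""
    window = hann(size)
    spectra = []
    for frame in frames(samples, size, hop):
        windowed = [s * w for s, w in zip(frame, window)]
        spectrum = fft(windowed)
        spectra.append([abs(c) for c in spectrum[:size // 2 + 1]])
    return spectra


def bin_to_hz(k, sample_rate=44100, size=1024):
    """Centre frequency of FFT bin k in Hz."""
    return k * sample_rate / size


if __name__ == "__main__":
    # Test tone placed exactly on bin 10 (assumed example input).
    tone = [math.sin(2.0 * math.pi * 10 * i / 1024) for i in range(4096)]
    for spectrum in magnitude_spectra(tone):
        peak = max(range(len(spectrum)), key=spectrum.__getitem__)
        print(peak, bin_to_hz(peak))
```

So yes: you take 1024 numbers from the stream, multiply them by the window, run the FFT, then move forward by the hop size and repeat. With a hop smaller than the frame size, consecutive frames share samples, which is the overlap mentioned above.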
I suggest you use MATLAB. That will make your task simple.