Can an audio file be converted into binary format? I have two audio files of the same song, one 59 seconds long and the other 10 seconds long. Once both are converted to binary, I want to check whether the 10-second file is a subset of the 59-second file. The plan is to store the 59-second file in a database and match the binary form of the 10-second file against it.
Can anybody suggest a better solution?
Thanks a lot in advance.
You're on a very wrong track there.
First of all, audio is always binary, whatever format (WAV, MP3, OGG, etc) it's encoded in.
Secondly, you will hardly ever get two 100% identical representations of the same audio signal in any raw format (like PCM), unless they were produced from the same original data and subjected to the exact same transformations. You almost certainly won't be able to do a simple substring search (substr) to find that one audio signal is "contained within" another, especially if the two samples were taken from different sources.
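To make that concrete, here is a minimal sketch of the naive substring idea on raw 16-bit PCM bytes. The sample rate, the 440 Hz test tone and the helper name `pcm16` are all assumptions made up for this illustration; nothing here comes from the asker's files. An exact cut of the same data is found, but the same cut made just 1% quieter is not.

```python
import math
import struct

def pcm16(samples):
    """Pack floats in [-1.0, 1.0] as little-endian signed 16-bit PCM bytes."""
    return b"".join(struct.pack("<h", int(round(s * 32767))) for s in samples)

RATE = 8000  # assumed sample rate for this toy example

# One second of a 440 Hz tone stands in for the longer recording.
long_clip = [0.5 * math.sin(2 * math.pi * 440 * n / RATE) for n in range(RATE)]

# A cut taken directly from the same sample data.
exact_cut = long_clip[2000:3000]

# The same cut, 1% quieter, as if it had come from a different source.
quieter_cut = [0.99 * s for s in exact_cut]

long_bytes = pcm16(long_clip)
found_exact = pcm16(exact_cut) in long_bytes      # byte-level substring search
found_quieter = pcm16(quieter_cut) in long_bytes  # same audio to the ear, different bytes

print(found_exact, found_quieter)
```

A gain change this small is inaudible, yet it alters almost every sample value, so the byte search has nothing to latch onto. Resampling, re-encoding or a different recording chain would change the bytes even more.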
It sounds like you want to create a sort of search database for audio, and are planning to store raw audio data in a database to search for it. Not a good approach. Apart from the above problems, you'll also need a ton of space to store all that raw audio and searching by comparing tons of raw samples to each other will be terrifically slow.
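As a rough back-of-the-envelope figure for the storage cost, the sketch below computes the size of uncompressed PCM. The defaults (44.1 kHz, 16-bit, stereo, i.e. CD-quality audio) are an assumption; the question does not say what format the files are in, and `pcm_size_bytes` is just a name picked for the example.

```python
def pcm_size_bytes(seconds, sample_rate=44100, bytes_per_sample=2, channels=2):
    """Size of uncompressed PCM audio, ignoring any container header."""
    return seconds * sample_rate * bytes_per_sample * channels

size_59s = pcm_size_bytes(59)
print(size_59s)                      # bytes for one 59-second clip
print(round(size_59s / 1_000_000, 1), "MB")
```

Under those assumptions a single 59-second clip is already around 10 MB, and every search would have to scan all of it for every stored song.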
You should a) learn more about audio and digital audio processing before you continue and b) look into acoustic fingerprinting.
MP3 is a lossy audio encoding.
That means that, in general, two raw (e.g. PCM) audio clips can be expected to produce the same MP3 output only if they are identical in every respect. The process is also not reversible: there is no way to recover the original raw audio from the MP3, at least not down to the byte level.
In addition, let's consider this process:
You have a raw audio clip A
You convert it to MP3 and then back to a raw audio clip A2
You cut off a part B of A, convert it to MP3 and then back to a raw audio clip B2
Unless the clip is something unusual such as absolute silence - something that can only be produced by crafting the source audio file artificially - the probability that B2 will be a subset of A2 is extremely small.
Keep in mind that the process above assumes that the same encoding software and parameters are used, which is not always the case, thus making any matches even more improbable.
In general, what you need is some sort of digital signal processing (DSP) algorithm that performs an audio similarity check. This is nowhere near as simple as a byte-for-byte binary comparison.
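As one very basic illustration of a similarity check (not a full fingerprinting system, and not necessarily what a production matcher would use), the sketch below slides the short clip over the long one and scores each position with normalized cross-correlation. The function name `best_match` and the synthetic test signal are assumptions for the example; real audio would first have to be decoded to sample values.

```python
import math
import random

def best_match(long_clip, short_clip):
    """Return (offset, score) of the best normalized cross-correlation match.

    score is in [-1, 1]; values near 1 mean the window has the same shape
    as short_clip, regardless of overall volume or DC offset.
    """
    m = len(short_clip)
    short_mean = sum(short_clip) / m
    short_c = [s - short_mean for s in short_clip]
    short_norm = math.sqrt(sum(s * s for s in short_c))
    best_offset, best_score = -1, -1.0
    for offset in range(len(long_clip) - m + 1):
        window = long_clip[offset:offset + m]
        w_mean = sum(window) / m
        w_c = [w - w_mean for w in window]
        w_norm = math.sqrt(sum(w * w for w in w_c))
        if w_norm == 0 or short_norm == 0:
            continue  # a flat window carries no shape to compare
        score = sum(a * b for a, b in zip(w_c, short_c)) / (w_norm * short_norm)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score

rng = random.Random(0)  # fixed seed so the example is repeatable
long_clip = [rng.uniform(-1, 1) for _ in range(2000)]
# The short clip: a quieter, slightly noisy copy of samples 700..899.
short_clip = [0.8 * s + rng.uniform(-0.05, 0.05) for s in long_clip[700:900]]

offset, score = best_match(long_clip, short_clip)
print(offset, round(score, 3))
```

Unlike the byte comparison, this tolerates a volume change and a little noise. It is still slow (every offset is tested) and it breaks down under tempo changes or heavy lossy compression, which is why dedicated acoustic fingerprinting techniques exist.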
The only possible exception to the above is if the shorter clip has been produced using some form of MP3 frame-level editing software. In that case, the raw audio equivalent might be a subset of the longer version.