
Voice communication in Python: help!

开发者 https://www.devze.com 2023-01-03 12:19 Source: web

I'm currently trying to write a voice chat program in Python. All tips and tricks are welcome.

So far I've found pyAudio, which is a wrapper around PortAudio. I played around with it and got an input stream from my microphone played back through my speakers. Raw audio only, of course.

But I can't send raw data over the network (because of the size, duh), so I'm looking for a way to encode it. I searched around the 'net and stumbled over this speex wrapper for Python. It seemed too good to be true, and believe me, it was.

You see, in pyAudio you can set the size of the chunks you want to take from your input audio buffer, and in the sample code on the link it's set to 320. Once encoded, each chunk is about 40 bytes of data, which is fairly acceptable I guess. And now for the problem.
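To put those chunk sizes in perspective, here is a rough back-of-the-envelope calculation. It assumes 8 kHz, 16-bit, mono capture (the question doesn't state the format) and reads 320 as a byte count, the way the question does. Note that PyAudio's `read` takes a frame count rather than a byte count, so it's worth checking which one the sample code means.

```python
# Assumed capture format (not stated in the question): 8 kHz, 16-bit, mono.
SAMPLE_RATE_HZ = 8000
BYTES_PER_SAMPLE = 2
RAW_CHUNK_BYTES = 320      # chunk size quoted in the question
ENCODED_CHUNK_BYTES = 40   # approximate encoded size quoted in the question

# How much audio one chunk holds under these assumptions.
samples_per_chunk = RAW_CHUNK_BYTES // BYTES_PER_SAMPLE
chunk_duration_ms = 1000 * samples_per_chunk / SAMPLE_RATE_HZ
chunks_per_second = SAMPLE_RATE_HZ // samples_per_chunk

# Resulting data rates before and after encoding.
raw_bytes_per_second = RAW_CHUNK_BYTES * chunks_per_second
encoded_bytes_per_second = ENCODED_CHUNK_BYTES * chunks_per_second

print("chunk duration (ms):", chunk_duration_ms)
print("raw bytes/s:", raw_bytes_per_second)
print("encoded bytes/s:", encoded_bytes_per_second)
```

Under these assumptions each chunk covers only a short slice of audio, so the read loop has to keep up with many chunks per second, which may be part of why the input buffer is so sensitive to the machine being busy.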

I start a sample program which just takes the input stream, encodes the chunks, decodes them and plays them (not sending anything over the network, since this is just a test). If I let my computer idle and run this program it works great, but as soon as I do something else, e.g. start Firefox, the audio input buffer gets all clogged up! It just grows until everything crashes with an overflow error on the buffer.

OK, so why am I taking just 320 bytes from the stream? I could take 1024 bytes or so, and that would ease the pressure on the buffer. BUT. If I give speex 1024 bytes of data to encode/decode, it either crashes and says that's too big for its buffer, OR it encodes/decodes it but the sound is very noisy and "choppy", as if it only encoded a tiny bit of that 1024-byte chunk and the rest is static noise. So the sound sounds like a helicopter, lol.

I did some research and it seems that speex can only convert 320 bytes of data at a time, or 640 for wide-band. Is that the standard? How can I fix this problem? How should I structure my program to work with speex? I could use an intermediate buffer that takes all the data available from the input buffer, then splits it into 320-byte chunks and encodes/decodes them. But this takes a bit longer and seems like a very bad solution to the problem.
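For what it's worth, the intermediate-buffer idea can be fairly small. Below is a minimal sketch, not a definitive implementation: `FrameBuffer` and `FRAME_BYTES` are hypothetical names made up for illustration, and the speex encode call itself is deliberately left out because the wrapper's API isn't shown here. The buffer accepts reads of any size and hands back only complete 320-byte frames, keeping the remainder for the next read.

```python
FRAME_BYTES = 320  # frame size from the question; 640 would be used for wide-band


class FrameBuffer:
    """Accumulates raw audio bytes and hands back fixed-size frames."""

    def __init__(self, frame_bytes=FRAME_BYTES):
        self.frame_bytes = frame_bytes
        self._pending = bytearray()

    def feed(self, data):
        """Append data and return every complete frame now available."""
        self._pending.extend(data)
        frames = []
        while len(self._pending) >= self.frame_bytes:
            frames.append(bytes(self._pending[:self.frame_bytes]))
            del self._pending[:self.frame_bytes]
        return frames

    def pending_bytes(self):
        """Number of leftover bytes waiting for the next read."""
        return len(self._pending)


# Usage: a 1024-byte read yields three full frames and 64 leftover bytes.
buf = FrameBuffer()
frames = buf.feed(b"\x00" * 1024)
print(len(frames), buf.pending_bytes())

# The leftover bytes are completed by the next read.
more = buf.feed(b"\x01" * 256)
print(len(more), buf.pending_bytes())
```

Each frame returned by `feed` could then be passed to the encoder one at a time, so the read size from the audio stream no longer has to match the codec's frame size.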

Because as far as I know, there's no other encoder for Python that encodes audio into packets small enough to send over the network, or is there? I've been googling for three days now.

There is also the pyMedia library, but I don't know whether converting to mp3/ogg is a good fit for this kind of software.

Thanks in advance for reading this, hope someone can help me! (:


You could try Huffman encoding, it's a pretty neat concept. I don't know how fast you could make it, but I'm sure if you created your own C/C++ module you could make it a lot faster.
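As a rough illustration of the concept, here is a minimal pure-Python Huffman sketch using only the standard library. The function names are made up for this example, and it produces a string of "0"/"1" characters rather than packed bytes, so treat it as a demonstration of the idea rather than something ready for the network. Keep in mind that Huffman coding is lossless, so on microphone audio the savings are likely to be much smaller than what a lossy speech codec gives.

```python
import heapq
from collections import Counter


def build_huffman_codes(data):
    """Return a dict mapping each byte value in data to its bit string."""
    counts = Counter(data)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(data, codes):
    """Encode bytes as a string of '0'/'1' characters."""
    return "".join(codes[b] for b in data)


def decode(bits, codes):
    """Decode a '0'/'1' string back into bytes using the same code table."""
    reverse = {c: s for s, c in codes.items()}
    out = bytearray()
    current = ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return bytes(out)


sample = b"aaaaaaaabbbbccd"
codes = build_huffman_codes(sample)
bits = encode(sample, codes)
print(len(sample) * 8, "raw bits ->", len(bits), "encoded bits")
print(decode(bits, codes) == sample)
```

The code table has to be known to the receiver as well, either by sending it along or by agreeing on a fixed table in advance.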

Of course, there may be already some modules out there that do exactly what you need - I've just never used them, so I'm completely unaware of their existence.
