Flash SPEEX codec conversion for Google Speech API - a challenge

Developer https://www.devze.com 2023-04-01 16:08 Source: web

People have figured out how to use the Google Speech API (Speech-To-Text). I'm trying to get it working with the Flash Speex codec, and I just can't figure it out. I've tried inserting a frame-size byte before every 160 bytes (as some sources suggest), but this doesn't work.

So I'm posting a challenge: translate the Flash Speex bytes into a form the Google Speech API understands.

Here is the basic Flex code:

<?xml version="1.0" encoding="utf-8"?>
<s:Application xmlns:fx="http://ns.adobe.com/mxml/2009" 
           xmlns:s="library://ns.adobe.com/flex/spark" 
           creationComplete="init();">
<fx:Script>
    <![CDATA[
        // Speech API info
        // Reference: http://mikepultz.com/2011/03/accessing-google-speech-api-chrome-11/,
        // Reference: https://stackoverflow.com/questions/4361826/does-chrome-have-buil-in-speech-recognition-for-input-type-text-x-webkit-speec
        private static const speechApiUrl:String = "http://www.google.com/speech-api/v1/recognize";
        private static const speechLanguage:String = "en";
        private static const mimeType:String = "audio/x-speex-with-header-byte";
        private static const sampleRate:uint = 8;

        // Sound bytes & mic
        private var soundBytes:ByteArray;
        private var microphone:Microphone;

        // Initial setup        
        private function init():void {
            // Set up the microphone
            microphone = Microphone.getMicrophone();
            // Speech API supports 8 kHz and 16 kHz rates
            microphone.rate = sampleRate;
            // Select the SPEEX codec
            microphone.codec = SoundCodec.SPEEX;
            // I don't know what effect this has...
            microphone.framesPerPacket = 1;
        }

        // THIS IS THE CHALLENGE
        // We have the flash speex bytes and we need to translate them so Google API understands
        private function process():void{
            soundBytes.position = 0;

            var processed:ByteArray = new ByteArray();
            processed.endian = Endian.BIG_ENDIAN;
            var frameSize:uint = 160;

            for(var n:uint = 0; n < soundBytes.bytesAvailable / frameSize; n++){
                processed.writeByte(frameSize);

                processed.writeBytes(soundBytes, frameSize * n, frameSize);
            }

            processed.position = 0;

            soundBytes = processed;
        }

        // Sending to Google Speech server
        private function send():void {
            var loader:URLLoader = new URLLoader();

            var request:URLRequest = new URLRequest(speechApiUrl + "?lang=" + speechLanguage);
            request.method = URLRequestMethod.POST;
            request.data = soundBytes;
            request.contentType = mimeType + "; rate=" + (1000 * sampleRate);

            loader.addEventListener(Event.COMPLETE, onComplete);
            loader.addEventListener(IOErrorEvent.IO_ERROR, onError);
            loader.load(request);

            trace("Connecting to Speech API server");
        }

        private function onError(event:IOErrorEvent):void{
            trace("Error: " + event.toString());
        }

        private function onComplete(event:Event):void{
            trace("Done: " + event.target.data);
        }

        private function record(event:Event):void{
            soundBytes = new ByteArray();
            soundBytes.endian = Endian.BIG_ENDIAN;

            microphone.addEventListener(SampleDataEvent.SAMPLE_DATA, sampleData);
        }

        private function sampleData(event:SampleDataEvent):void {               
            soundBytes.writeBytes(event.data, 0, event.data.bytesAvailable);
        }

        private function stop(e:Event):void {
            microphone.removeEventListener(SampleDataEvent.SAMPLE_DATA, sampleData);

            if(soundBytes != null){
                process();
                send();
            }
        }       
    ]]>
</fx:Script>

<s:HGroup>
    <s:Button label="Record"
              click="record(event)"/>
    <s:Button label="Stop and Send"
              click="stop(event)"/>
</s:HGroup>
</s:Application>
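
The framing loop in process() above hard-codes 160 as the frame length. In Speex, 160 is the number of samples in one 20 ms narrowband frame at 8 kHz, while the encoded frame is normally much smaller and its byte length depends on the codec mode. The following is a minimal sketch, not a confirmed fix, that pulls the same header-byte framing into a helper with the frame length as a parameter so different values can be tried. The class name SpeexFraming and the method addHeaderBytes are hypothetical, and the correct frame length for Flash's output is an assumption the caller has to supply.

```actionscript
package {
    import flash.utils.ByteArray;

    // Hypothetical helper, not part of any Flash or Google API.
    public class SpeexFraming {

        /**
         * Returns a new ByteArray in which every frameSize-byte chunk of
         * source is preceded by a single byte holding that chunk's length.
         * Trailing bytes that do not fill a whole frame are dropped.
         * frameSize is an assumption: it must match the encoded frame
         * length actually produced by the codec.
         */
        public static function addHeaderBytes(source:ByteArray, frameSize:uint):ByteArray {
            // The length must fit into the one-byte header.
            if (frameSize == 0 || frameSize > 255) {
                throw new ArgumentError("frameSize must be between 1 and 255");
            }

            var framed:ByteArray = new ByteArray();
            // Number of complete frames in the source buffer.
            var frameCount:uint = uint(Math.floor(source.length / frameSize));

            for (var n:uint = 0; n < frameCount; n++) {
                framed.writeByte(frameSize);                          // header byte
                framed.writeBytes(source, n * frameSize, frameSize);  // frame payload
            }

            framed.position = 0;
            return framed;
        }
    }
}
```

With this in place, process() could be reduced to `soundBytes = SpeexFraming.addHeaderBytes(soundBytes, 160);`, which reproduces the original behaviour, and other frame lengths can be tested by changing the second argument.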

For more info, check these links: http://mikepultz.com/2011/03/accessing-google-speech-api-chrome-11/ and the Stack Overflow question "Does Chrome have built-in speech recognition for "x-webkit-speech" input elements?"


The code you are looking for is at http://src.chromium.org/viewvc/chrome/trunk/src/content/browser/speech/speech_recognizer.cc?view=diff&r1=79556&r2=79557, around lines 100-160, which in turn #includes the Speex sources under .../viewvc/chrome/trunk/deps/third_party/speex/

However, Chrome switched from Speex to FLAC at the end of March without any real explanation in the change log (http://src.chromium.org/viewvc/chrome/trunk/src/content/browser/speech/speech_recognizer.cc?view=diff&r1=79556&r2=79557), so I would not advise using Speex. On the other hand, someone looked at the Android source and said Speex is still used there, so it is likely to remain supported (it needs less than a fifth as many bytes per second).
