Automate speech input recording in Chrome

Developer https://www.devze.com 2023-04-12 21:57 Source: web

I'm trying to automate the recording of speech in Google's speech input (only works in Chrome).

As it is, the user has to click the mic to start the recording but I'm working on an installation where the user won't interact with the computer. Thus I have to trigger the recording some other way.

As far as I can tell, you can't access the speech input functionality from code, i.e. you can't call a function to start recording. So now I'm looking at simulating a mouse click on the mic.

I've tried using JavaScript, but it seems a simulated event only triggers event handlers, not the browser's default behaviour (e.g. a simulated click on an input field fires its click handlers but doesn't give focus to the field).

So now I'm looking at simulating Windows system mouse clicks. I've found some programs that can do that (mostly at intervals), and it works: the recording starts. The problem is that I have to activate the click simulation from the browser application.

My best bet so far is AutoHotkey, which lets you create custom scripts, in my case a script that simulates a mouse click at a given position. So if I could execute this script from the browser I would be set, but I don't know how to do that.
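One possible way to bridge the browser and AutoHotkey is a tiny helper process on the same machine that the page can reach over HTTP. The sketch below is only an illustration of that idea, not a tested installation setup: the port `8765`, the `/click` path, the script name `click_mic.ahk`, and the AutoHotkey install path are all assumptions, and `make_server` / `run_click_script` are hypothetical names. It relies on the fact that `AutoHotkey.exe` accepts a script file as a command-line argument.

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumed paths: adjust to wherever AutoHotkey and your script actually live.
AHK_EXE = r"C:\Program Files\AutoHotkey\AutoHotkey.exe"
AHK_SCRIPT = r"C:\scripts\click_mic.ahk"


def run_click_script():
    """Launch the AutoHotkey script without waiting for it to finish."""
    subprocess.Popen([AHK_EXE, AHK_SCRIPT])


def make_server(port=8765, on_click=run_click_script):
    """Build a localhost-only HTTP server that calls on_click for GET /click."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/click":
                on_click()
                status, body = 200, b"ok"
            else:
                status, body = 404, b"not found"
            self.send_response(status)
            # Lets a page served from another origin read the response.
            self.send_header("Access-Control-Allow-Origin", "*")
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the console quiet

    # Bind to 127.0.0.1 so only the local machine can trigger clicks.
    return HTTPServer(("127.0.0.1", port), Handler)
```

To use it, run `make_server().serve_forever()` on the installation machine and have the page request `http://127.0.0.1:8765/click` whenever recording should start. Passing a different `on_click` callable makes it easy to try the server without AutoHotkey installed.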

Any ideas and / or thoughts are welcome!


I'm facing a similar problem. We wanted to start and stop the recording to test how well the Google API handles voice recognition in German, but we have not found a solution yet.

The HTML5 feature is still limited and only works on five input fields. Maybe you will find some information here: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0020/api-draft.html We will come back to the topic tomorrow.


I faced a similar problem, then I took a look at this post by Mike Pultz:

http://mikepultz.com/2011/03/accessing-google-speech-api-chrome-11/

Basically, he creates the audio file himself, uses SoX to convert it to FLAC, and then sends it to the Google speech API. So you do not need to click the mic; you can instead build your own mic callback.
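The convert-and-upload steps described above could be sketched roughly as follows. This is a sketch under stated assumptions, not Mike Pultz's actual code: `flac_command` and `build_request` are hypothetical helper names, the 16 kHz mono settings and the `audio/x-flac; rate=16000` content type are assumptions about what the speech endpoint expected, and the endpoint URL is deliberately left for the caller to supply (take it from the linked post). SoX must be installed and on the `PATH` for the conversion step.

```python
import urllib.request


def flac_command(src, dst, rate=16000):
    """Build the SoX command that converts src to a mono FLAC file at `rate` Hz."""
    return ["sox", src, "-r", str(rate), "-c", "1", dst]


def build_request(url, flac_bytes, rate=16000):
    """Build (but do not send) a POST request carrying FLAC audio.

    `url` is whatever speech endpoint you are targeting; it is not filled in here.
    """
    headers = {"Content-Type": f"audio/x-flac; rate={rate}"}
    return urllib.request.Request(url, data=flac_bytes, headers=headers, method="POST")
```

A typical flow would be `subprocess.run(flac_command("in.wav", "in.flac"), check=True)`, then reading `in.flac` as bytes, passing them to `build_request`, and sending the result with `urllib.request.urlopen`.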

I have also created a C# solution at https://github.com/seigneur/Voice-Biometrics, and you can look at this video for further help: http://www.youtube.com/watch?v=PA00SPOTL-M

Hope it helps
