Speech recognition language model

Developer https://www.devze.com 2023-01-20 03:15 Source: web

I would like to integrate speech recognition into my Android application.

I am aware that Google provides two language models (free form for dictation, and web search for short phrases).

However, my app will have a finite number of possible words (maybe a few thousand). Is it possible to specify the vocabulary, limiting recognition to these words, in the hope of achieving more accurate results?

My immediate thought is to use the web search language model and then check its results against my vocabulary.

Any thoughts appreciated.

I think your intuition is correct and you've answered your own question.

The built-in speech recognition provided by Google only supports the dictation and search language models. See http://developer.android.com/reference/android/speech/RecognizerIntent.html

You can get back results using these recognizer models and then classify or filter the results to find what best matches your limited vocabulary. There are different techniques to do this and they can range from simple parsing to complex statistical models.
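To make the "simple" end of that range concrete, here is a minimal sketch in plain Java that picks the vocabulary entry closest to any of the recognizer's candidate strings by edit distance. The class name `VocabularyMatcher`, the sample word list, and the `maxDistance` threshold are all made up for illustration; on Android the candidate list would typically be the strings returned under `RecognizerIntent.EXTRA_RESULTS`. Treat it as one possible starting point, not a recommended matcher.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class VocabularyMatcher {

    /** Levenshtein (edit) distance between two strings, computed row by row. */
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning an empty prefix of a into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into an empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int substitution = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int deletion = prev[j] + 1;
                int insertion = curr[j - 1] + 1;
                curr[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns the vocabulary entry closest to any candidate, or null when
     * nothing is within maxDistance edits (hypothetical threshold).
     */
    public static String bestMatch(List<String> candidates, List<String> vocabulary, int maxDistance) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String candidate : candidates) {
            String heard = candidate.trim().toLowerCase(Locale.ROOT);
            for (String word : vocabulary) {
                int d = editDistance(heard, word.toLowerCase(Locale.ROOT));
                if (d < bestDistance) {
                    bestDistance = d;
                    best = word;
                }
            }
        }
        return bestDistance <= maxDistance ? best : null;
    }

    public static void main(String[] args) {
        // Made-up vocabulary and recognizer candidates, for illustration only.
        List<String> vocabulary = Arrays.asList("aspirin", "ibuprofen", "paracetamol");
        List<String> candidates = Arrays.asList("I be profane", "ibuprofin");
        System.out.println(bestMatch(candidates, vocabulary, 2)); // prints "ibuprofen"
    }
}
```

Comparing spellings is crude, since recognition errors are acoustic rather than orthographic. A phonetic comparison or a statistical classifier may do better, and with a few thousand words you would probably want to index the vocabulary rather than scan it for every candidate.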

The only other alternative I've seen is to run a different speech recognizer on a server that accepts your own dedicated language model. This is costly and complex, though, and is the approach used by commercial speech companies such as Vlingo, Dragon, or Microsoft's Bing.

You can use open-source models such as VoxForge, or inexpensive commercial ones such as LumenVox. Some have been ported to Android, though I forget by whom.

I answered pretty much the same question before - please check here: Building openears compatible language model

and here:

Typically, you need very large text corpora to generate useful language models.

If you have only a small amount of training data, your language model will be overfitted, which means that it will not generalize to word sequences it has not seen.
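As a toy illustration of that overfitting point, here is a sketch of an unsmoothed bigram model in plain Java, trained on two made-up sentences. The class name `TinyBigramModel` and the training sentences are hypothetical, and this is not how OpenEars or any real toolkit builds its models; it only shows what happens when the counts come from too little text.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class TinyBigramModel {
    private final Map<String, Integer> bigramCounts = new HashMap<>();
    private final Map<String, Integer> contextCounts = new HashMap<>();

    /** Counts adjacent word pairs in one whitespace-separated sentence. */
    public void train(String sentence) {
        String[] words = sentence.trim().toLowerCase(Locale.ROOT).split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            contextCounts.merge(words[i], 1, Integer::sum);
            bigramCounts.merge(words[i] + " " + words[i + 1], 1, Integer::sum);
        }
    }

    /** Maximum-likelihood estimate of P(next | previous), with no smoothing. */
    public double probability(String previous, String next) {
        int context = contextCounts.getOrDefault(previous, 0);
        if (context == 0) {
            return 0.0; // the previous word never appeared as a context
        }
        int pair = bigramCounts.getOrDefault(previous + " " + next, 0);
        return pair / (double) context;
    }

    public static void main(String[] args) {
        TinyBigramModel model = new TinyBigramModel();
        model.train("call john smith");
        model.train("call mary jones");

        // Pairs seen in training get all of the probability mass.
        System.out.println(model.probability("call", "john"));  // 0.5
        System.out.println(model.probability("john", "smith")); // 1.0

        // "john jones" is a perfectly plausible thing to say, but it was
        // never seen, so the model rules it out entirely.
        System.out.println(model.probability("john", "jones")); // 0.0
    }
}
```

Smoothing and backoff techniques soften those hard zeros, but they only redistribute probability mass and cannot supply knowledge that the training text never contained.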