Keyword Spotting in Speech [closed]

Developer https://www.devze.com 2023-02-14 11:23 Source: web
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.

Closed 6 years ago.

Is anyone aware of a keyword spotting (KWS) system that is freely available and, ideally, provides an API?

CMU Sphinx 4 and MS Speech API are speech recognition engines, and cannot be used for KWS.

SRI has a keyword spotting system, but there are no download links, not even for evaluation. (I couldn't even find a link to contact them about their software.)

I found one here but it's a demo and limited.


CMUSphinx implements keyword spotting in the pocketsphinx engine; see the FAQ entry for details.

To recognize a single keyphrase, you can run the decoder in “keyphrase search” mode.

From the command line, try:

pocketsphinx_continuous -infile file.wav -keyphrase "oh mighty computer" -kws_threshold 1e-20

From the code:

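 /* Register a keyphrase search under the name "keyphrase_search" */
 ps_set_keyphrase(ps, "keyphrase_search", "oh mighty computer");
 /* Make it the active search, then start an utterance */
 ps_set_search(ps, "keyphrase_search");
 ps_start_utt(ps);
 /* process data */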

You can also find examples for Python and Android/Java in our sources. The Python code looks like this (full example here):

# Process audio chunk by chunk. When the keyphrase is detected, perform an action and restart the search.
# `Decoder`, `config` and `stream` are set up beforehand, as in the full example.
decoder = Decoder(config)
decoder.start_utt()
while True:
    buf = stream.read(1024)
    if not buf:
        break  # end of audio
    decoder.process_raw(buf, False, False)
    if decoder.hyp() is not None:
        print([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
        print("Detected keyphrase, restarting search")
        decoder.end_utt()
        decoder.start_utt()

The threshold must be tuned for every keyphrase on test data to get the right balance between missed detections and false alarms. You can try values from 1e-5 to 1e-50.
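The tuning described above can be sketched as a simple sweep over candidate thresholds on labeled test clips. This is only a rough sketch, not part of pocketsphinx: `detect` is a hypothetical callable that you would write yourself (for example, a wrapper that runs the decoder on one clip with a given `-kws_threshold` and returns whether the keyphrase fired), and `sweep_thresholds` and `pick_threshold` are illustrative names.

```python
from typing import Callable, Iterable, List, Tuple

# One row per threshold: (threshold, missed_detections, false_alarms)
SweepRow = Tuple[float, int, int]


def sweep_thresholds(
    detect: Callable[[str, float], bool],
    labeled_clips: Iterable[Tuple[str, bool]],
    thresholds: Iterable[float],
) -> List[SweepRow]:
    """Count missed detections and false alarms for each candidate threshold.

    `detect(clip, threshold)` is a hypothetical user-supplied function that
    returns True if the keyphrase was reported for that clip.
    `labeled_clips` pairs each clip with whether it really contains the keyphrase.
    """
    clips = list(labeled_clips)
    rows: List[SweepRow] = []
    for threshold in thresholds:
        missed = 0
        false_alarms = 0
        for clip, has_keyphrase in clips:
            fired = detect(clip, threshold)
            if has_keyphrase and not fired:
                missed += 1
            elif fired and not has_keyphrase:
                false_alarms += 1
        rows.append((threshold, missed, false_alarms))
    return rows


def pick_threshold(rows: List[SweepRow]) -> float:
    """Pick the threshold with the fewest total errors (first one wins ties)."""
    return min(rows, key=lambda row: row[1] + row[2])[0]
```

Weighting misses and false alarms equally is just one possible choice; if false alarms are more costly in your application, adjust the key in `pick_threshold` accordingly.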

For the best accuracy, use a keyphrase with 3-4 syllables. Phrases that are too short are easily confused.

You can also search for multiple keyphrases. Create a file keyphrase.list like this:

  oh mighty computer /1e-40/
  hello world /1e-30/
  other_phrase /other_phrase_threshold/

Then use it in the decoder with the -kws configuration option:

  pocketsphinx_continuous -inmic yes -kws keyphrase.list

This feature is not yet implemented in the sphinx4 decoder.
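If you tune thresholds per keyphrase, it may be convenient to generate the list file from a dictionary instead of editing it by hand. Here is a minimal sketch that writes the `phrase /threshold/` format shown above; the function names are illustrative and not part of pocketsphinx.

```python
from typing import Mapping


def format_keyphrase_list(thresholds: Mapping[str, float]) -> str:
    """Render one 'phrase /threshold/' line per keyphrase."""
    lines = []
    for phrase, threshold in thresholds.items():
        phrase = phrase.strip()
        if not phrase:
            raise ValueError("keyphrase must not be empty")
        # ':g' keeps small values in compact exponent form, e.g. 1e-40
        lines.append(f"{phrase} /{threshold:g}/")
    return "\n".join(lines) + "\n"


def write_keyphrase_list(path: str, thresholds: Mapping[str, float]) -> None:
    """Write the rendered list to `path` so it can be passed to -kws."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(format_keyphrase_list(thresholds))
```

For example, `write_keyphrase_list("keyphrase.list", {"oh mighty computer": 1e-40, "hello world": 1e-30})` would produce the first two lines of the file shown above.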
