Detecting blowing on a microphone with GStreamer (or another library) [closed]

Closed. This question needs details or clarity. It is not currently accepting answers. Want to improve this question? Add details and clarify the problem by editing this post.

Closed 9 years ago.

Can I detect blowing on a microphone with GStreamer (or another Linux-compatible sound library)?

I can get some information about the sound like this:

import gtk, gst

def playerbinMessage(bus, message):
    if message.type == gst.MESSAGE_ELEMENT:
        struct = message.structure

        if struct.get_name() == 'level':
            # printing peak, decay, rms
            print struct['peak'][0], struct['decay'][0], struct['rms'][0]

pipeline = gst.parse_launch('pulsesrc ! level ! filesink location=/dev/null')

bus = pipeline.get_bus()
bus.add_signal_watch()
bus.connect('message', playerbinMessage)

pipeline.set_state(gst.STATE_PLAYING)

gtk.main()

I use this to detect clapping, but I don't know whether I can use this information to detect blowing without the computer confusing blowing with talking. I also don't know whether there is another way to analyse sound with GStreamer or another Linux-compatible sound library.


You need to look at more than the audio level to distinguish blowing from speech. For a start, most speech energy sits above roughly 80 Hz, while blowing on the mic produces lots of low-frequency rumble.

So, if you want to stick with GStreamer, try using the "audiocheblimit" filter to low-pass the sound before measuring its level (something like audiocheblimit mode=low-pass cutoff=40 poles=4).
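Concretely, that just means inserting the filter into the question's pipeline string before the level element. A sketch for the same pygst 0.10 setup (the audioconvert element is added here on the assumption the filter wants a converted format; running it still requires PulseAudio):

```python
# The OP's pipeline with a low-pass filter ahead of the level measurement.
PIPELINE = ('pulsesrc ! audioconvert '
            '! audiocheblimit mode=low-pass cutoff=40 poles=4 '
            '! level ! filesink location=/dev/null')

# As in the question's code, this string would then be launched with:
# pipeline = gst.parse_launch(PIPELINE)
print(PIPELINE)
```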

Personally, my approach would be more like:

  1. record the raw audio with something like python-alsaaudio
  2. compute the fourier transform of sound chunks using numpy
  3. sum up the amplitudes of low frequencies (20-40Hz, maybe) and trigger if this value is large enough.

If that didn't work, then I'd look for more clever detection algorithms. This approach (alsa+numpy) is very flexible, but a bit more complicated than the gstreamer approach.
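The alsa+numpy steps above can be sketched without any sound hardware. Here the Fourier transform is done by hand with cmath so the snippet is self-contained (with numpy you would call numpy.fft.rfft instead); the 20-40 Hz band and the comparison against a "speech" band are assumptions to tune against real recordings:

```python
import cmath
import math

def band_energy(samples, lo_hz, hi_hz, rate):
    """Sum of DFT magnitudes over [lo_hz, hi_hz] Hz.

    A naive DFT is fine for short chunks; with numpy, use numpy.fft.rfft.
    """
    n = len(samples)
    lo_bin = max(1, int(lo_hz * n / rate))  # skip bin 0 (DC offset)
    hi_bin = int(hi_hz * n / rate)
    total = 0.0
    for k in range(lo_bin, hi_bin + 1):
        coeff = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        total += abs(coeff)
    return total

def looks_like_blowing(samples, rate):
    """Heuristic: 'blowing' if the 20-40 Hz band outweighs 80-1000 Hz."""
    return band_energy(samples, 20, 40, rate) > band_energy(samples, 80, 1000, rate)

# Synthetic check: a 30 Hz rumble vs a 300 Hz tone, 0.1 s at 8 kHz.
rate, n = 8000, 800
rumble = [math.sin(2 * math.pi * 30 * t / rate) for t in range(n)]
speech = [math.sin(2 * math.pi * 300 * t / rate) for t in range(n)]
print(looks_like_blowing(rumble, rate))  # True
print(looks_like_blowing(speech, rate))  # False
```

In practice you would feed 0.1-second chunks from the capture loop into looks_like_blowing and debounce over a few consecutive chunks to avoid one-off triggers.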

edit: I just noticed gstreamer also has a "spectrum" element that will return the fourier transform.


A mix of the answer above and the OP's code (sample pipeline):

#!/usr/bin/env python

import pygtk
pygtk.require('2.0')
import gtk, gst, time

class HelloWorld:

  def delete_event(self, widget, event, data=None):
      print "delete event occurred"
      return False

  def destroy(self, widget, data=None):
      print "destroy signal occurred"
      gtk.main_quit()

  def __init__(self):
      self.window = gtk.Window(gtk.WINDOW_TOPLEVEL)
      self.window.connect("delete_event", self.delete_event)
      self.window.connect("destroy", self.destroy)
      self.window.set_border_width(2)
      #self.window.set_size_request(600, 483)

      """ Play """
      self.vbox = gtk.VBox(False, 2)
      self.vbox.set_border_width(0)

      self.hbox = gtk.HBox()
      self.hlpass = gtk.Entry()
      self.hlpass.set_text("low-pass")
      self.hbox.pack_start( gtk.Label("High/Low-pass: "), False, False, 0 )
      self.hbox.pack_start( self.hlpass, False, False, 0 )
      self.vbox.add(self.hbox)

      self.hbox = gtk.HBox()
      self.cutoff = gtk.Entry()
      self.cutoff.set_text("40")
      self.hbox.pack_start( gtk.Label("Cutoff: "), False, False, 0 )
      self.hbox.pack_start( self.cutoff, False, False, 0 )
      self.vbox.add(self.hbox)

      self.hbox = gtk.HBox()
      self.poles = gtk.Entry()
      self.poles.set_text("4")
      self.hbox.pack_start( gtk.Label("Poles: "), False, False, 0 )
      self.hbox.pack_start( self.poles, False, False, 0 )
      self.vbox.add(self.hbox)

      self.hbox = gtk.HBox()
      self.button = gtk.Button("High-Pass")
      self.button.connect("clicked", self.change, None)
      self.hbox.pack_start(self.button, False, False, 0 )
      self.vbox.add(self.hbox)

      self.window.add(self.vbox)
      self.window.show_all()

  def main(self):
      self.gst()
      gtk.main()

  def gst(self):
      test = """
      alsasrc device=hw:0 ! audioconvert ! audioresample ! audiocheblimit mode=low-pass cutoff=40 poles=4 name=tuneit ! level ! autoaudiosink
      """
      self.pipeline = gst.parse_launch(test)
      self.bus = self.pipeline.get_bus()
      self.bus.add_signal_watch()
      self.bus.connect('message', self.playerbinMessage)
      self.pipeline.set_state(gst.STATE_PLAYING)

  def playerbinMessage(self,bus, message):
    if message.type == gst.MESSAGE_ELEMENT:
      struct = message.structure
      if struct.get_name() == 'level':
        print struct['peak'][0], struct['decay'][0], struct['rms'][0]
        #time.sleep(1)

  def change(self, widget, data=None):
    data = [self.hlpass.get_text(), self.cutoff.get_text(), self.poles.get_text()]
    print data[0], data[1], data[2]
    # look up the filter by the name given in the pipeline and retune it live
    self.audiocheblimit = self.pipeline.get_by_name('tuneit')
    self.audiocheblimit.props.mode = data[0]
    self.audiocheblimit.props.cutoff = int(data[1])
    self.audiocheblimit.props.poles = int(data[2])

if __name__ == "__main__":
    hello = HelloWorld()
    hello.main()

Output low-pass:

-20.9227157774 -20.9227157774 -20.953279177
-20.9366239523 -20.9227157774 -20.9591815321
-20.9290995367 -20.9227157774 -20.9601319723

Output high-pass:

-51.2328030138 -42.8335117509 -62.2730163502
-51.3932079772 -43.3559607159 -62.2080540769
-52.1412276733 -43.8784096809 -62.9151309943

EDIT:

high-pass = speech and most other audio still comes through
low-pass  = only low-frequency sound gets through, e.g. when you blow or talk very close to the microphone
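Given output like the above, a crude decision rule is just a threshold on the measured RMS: in the sample runs, the low-pass configuration sits around -21 dB while the high-pass one sits around -51 dB. Assuming blowing is what survives the low-pass, a threshold between the two separates the cases (the -30 dB value is a guess to tune for your microphone):

```python
def is_blowing(rms_db, threshold_db=-30.0):
    # After low-pass filtering, blowing keeps the RMS high (about -21 dB in
    # the sample output), while the high-pass run drops to about -51 dB.
    # threshold_db is a hypothetical cutoff; tune it for your setup.
    return rms_db > threshold_db

print(is_blowing(-20.9))  # True: matches the low-pass sample output
print(is_blowing(-51.2))  # False: matches the high-pass sample output
```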


The CMU Sphinx project (http://cmusphinx.sourceforge.net/) is a toolkit for speech recognition, and it can use GStreamer to provide a microphone stream. You could have a look at it.

