How can I get title & scripts inside a webpage using webkit + gtk?

Developer https://www.devze.com 2023-01-22 02:57 Source: Internet

Here's my code snippet:

import gtk, webkit
window = gtk.Window()
browser = webkit.WebView()
url = "www.google.com"
browser.open(url)

Now I want to get the web page's title and the script tags inside it. How can I do that?

The documentation is not clear on these points. I only found documentation for Objective-C, and I am trying to find my way from there. If you know where I can get a better reference, please tell me. It does not have to be for Python; C or C++ would be fine too.

Thanks


I think the following should work (I can't try it out right now):

def title_changed(widget, frame, title):
    print title

browser.connect('title-changed', title_changed)

There is some documentation here and here, and there are two examples in the demo directory of the source tarball.


Getting the title and scripts is not bound to the technology used to retrieve the HTML. Once the browser has opened the page, just parse the HTML with Beautiful Soup, or with anything that supports XPath, for example.
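
As a rough illustration of the parsing step, here is a minimal sketch. It assumes you already have the page's HTML as a string (how to pull it out of the WebView is not shown here). It uses Python 3's standard-library html.parser as a stand-in for Beautiful Soup or an XPath library, so it is not the PyGTK/Python 2 setup from the question. The names TitleAndScripts, extract_title_and_scripts and sample_html are made up for this example.

```python
from html.parser import HTMLParser


class TitleAndScripts(HTMLParser):
    """Collect the <title> text and every <script> tag of a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.scripts = []          # one dict per <script>: {"src": ..., "body": ...}
        self._in_title = False
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "script":
            self._in_script = True
            # attrs is a list of (name, value) pairs; src is None for inline scripts
            self.scripts.append({"src": dict(attrs).get("src"), "body": ""})

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate instead of assigning
        if self._in_title:
            self.title += data
        elif self._in_script:
            self.scripts[-1]["body"] += data


def extract_title_and_scripts(html):
    """Return (title, scripts) for an HTML string."""
    parser = TitleAndScripts()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.scripts


# Made-up sample page, standing in for the HTML loaded by the browser
sample_html = """<html><head><title>Example page</title>
<script src="app.js"></script>
<script>var x = 1;</script>
</head><body><p>Hello</p></body></html>"""

title, scripts = extract_title_and_scripts(sample_html)
```

After this runs, title holds the page title and scripts holds one entry per script tag, with src set for external scripts and body set for inline ones. With Beautiful Soup installed, the same idea would probably be shorter, but the standard-library version avoids an extra dependency.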
