I'm trying to use the HTML::Grabber module to parse HTML in Perl. It works when I use it in my main process, but it throws an error when I try to use it with threads.
Specifically, I get this error:
Thread 1 terminated abnormally: Can't call method "parse_html_string"
on unblessed reference at /usr/local/ActivePerl-5.10/site/lib/HTML/Grabber.pm line 79.
It happens where the Grabber object is created:
$mech->get($link);
$dom = HTML::Grabber->new(html => $mech->content); #at this point
Any idea how to fix this weird problem?
The parse_html_string method is called on an XML::LibXML parser object. XML::LibXML seems to have mixed support for threads:
- http://search.cpan.org/~shlomif/XML-LibXML-1.78/LibXML.pod#THREAD_SUPPORT
What is probably happening is that HTML::Grabber creates the parser object when it is loaded by your script in the main thread. When you then create a child thread, XML::LibXML does not clone its objects between threads, so the parser is no longer a usable object in the child, which would explain the "unblessed reference" error. You will need to load HTML::Grabber at runtime with require, inside the thread after it is spawned, rather than with use at the top of the script.
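A minimal sketch of that approach, assuming the diagnosis above is right. The worker name parse_in_thread and the literal HTML string are made up for illustration; in your script the HTML would come from $mech->content. Note that there is no "use HTML::Grabber" anywhere in the file.

```perl
use strict;
use warnings;
use threads;

# Hypothetical worker: HTML::Grabber is loaded here, after the thread
# exists, so any parser object it creates belongs to this thread.
sub parse_in_thread {
    my ($html) = @_;
    my $class = eval {
        require HTML::Grabber;    # runtime load inside the thread
        my $dom = HTML::Grabber->new(html => $html);
        ref $dom;                 # return a plain string, not the object
    };
    return defined $class ? "ok: $class" : "failed: $@";
}

# Placeholder HTML; an assumption standing in for $mech->content.
my $thr = threads->create(\&parse_in_thread,
    '<html><body><p>hi</p></body></html>');
my $result = $thr->join;
print "$result\n";
```

The worker returns a plain string rather than the HTML::Grabber object, since passing XML::LibXML-backed objects back to the parent thread would likely run into the same cloning problem.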
If that is not the case, you will have to boil down your problem to a small example and post the code here.