How would I go about parsing HTML in C++ in my web server application?
libxml2 has an HTML parser. libxml++ is a wrapper for libxml2, but I'm not sure if it exposes the HTMLparser functionality.
It will mainly depend on what you want to retrieve from your webpage. You can try boost::spirit to create your own parser (or a Yacc/Lex parser).
If you are looking for simpler information in the HTML page, getc may be good enough...
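To give an idea of what "good enough" might look like, here is a rough sketch that pulls the page title out of a stream one character at a time with getc. The function name `extract_title` is made up, and it assumes the tag is written as a plain `<title>` with no attributes; anything more involved is better left to a real parser.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper: returns the text between the first "<title>" tag
// (matched case-insensitively) and the next '<'. Returns an empty string
// if no such tag is found before end of input.
std::string extract_title(std::FILE* in) {
    const std::string open_tag = "<title>";
    std::size_t matched = 0;  // number of open_tag characters matched so far
    int c = 0;

    // Phase 1: consume input until the whole opening tag has been seen.
    while (matched < open_tag.size() && (c = std::getc(in)) != EOF) {
        const char lower =
            static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        if (lower == open_tag[matched]) {
            ++matched;
        } else {
            // '<' occurs only at the start of the pattern, so on a mismatch
            // the only possible partial match is a fresh '<'.
            matched = (lower == open_tag[0]) ? 1 : 0;
        }
    }
    if (matched < open_tag.size()) {
        return "";
    }

    // Phase 2: collect characters up to the next tag or end of input.
    std::string title;
    while ((c = std::getc(in)) != EOF && c != '<') {
        title.push_back(static_cast<char>(c));
    }
    return title;
}
```

You would call it as `extract_title(fp)` on a `FILE*` opened for reading. It does no entity decoding and will be fooled by a `<title>` inside a comment or script.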
Hand parsing gets messy, even for relatively trivial cases.
Have you considered a lexer/parser generator, such as Flex/Bison? I highly recommend ANTLR, and get ANTLRWorks along with it.
A picture is worth a thousand words, so this will tell you why - http://www.antlr.org/works/screenshots/editor.jpg