If I needed to facilitate the extraction of data from various (non-API) internet sources, is there a framework-type solution that would streamline the process of having developers write reusable, yet source-specific, parsers on a large scale?
Pyparsing is a Python library that I've found to be very useful for parsing custom domain-specific languages.
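To give a feel for it, here is a minimal sketch of a pyparsing grammar. The tiny "name = number" format and the `parse_config` helper are made up for illustration, not taken from any real source, and the camelCase method names are used on the assumption that they work in both pyparsing 2.x and 3.x.

```python
from pyparsing import Group, OneOrMore, Suppress, Word, alphanums, alphas, nums

# Hypothetical mini-DSL: one or more assignments such as "width = 10".
key = Word(alphas, alphanums + "_")
value = Word(nums).setParseAction(lambda tokens: int(tokens[0]))  # convert digits to int
assignment = Group(key + Suppress("=") + value)
config = OneOrMore(assignment)


def parse_config(text):
    """Parse the whole input and return a dict of name -> integer value."""
    results = config.parseString(text, parseAll=True)
    return {name: number for name, number in results}


print(parse_config("width = 10\nheight = 20"))
```

Because the grammar is built from small named pieces, each source can reuse the shared pieces (`key`, `value`) and only define what is specific to it.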
For *ML (HTML or XML) screen scraping, look no further than Beautiful Soup.
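As a rough sketch of how this could be organized for many sources, each developer could write one small Beautiful Soup function per source and register it under a name. The registry, the `parser_for` decorator, the "example-shop" source and its HTML layout are all hypothetical and only meant to show the shape of the idea.

```python
from bs4 import BeautifulSoup

PARSERS = {}  # hypothetical registry: source name -> parser function


def parser_for(source):
    """Decorator that registers a source-specific parser under a name."""
    def register(func):
        PARSERS[source] = func
        return func
    return register


@parser_for("example-shop")  # made-up source with an assumed page layout
def parse_example_shop(html):
    soup = BeautifulSoup(html, "html.parser")
    return [
        {
            "name": item.find("h2").get_text(strip=True),
            "price": item.find("span", class_="price").get_text(strip=True),
        }
        for item in soup.find_all("div", class_="product")
    ]


def extract(source, html):
    """Look up the parser registered for a source and run it on the page."""
    return PARSERS[source](html)


sample = """
<div class="product"><h2> Widget </h2><span class="price">9.99</span></div>
<div class="product"><h2>Gadget</h2><span class="price">19.50</span></div>
"""
print(extract("example-shop", sample))
```

The calling code only ever uses `extract`, so adding a new source means adding one decorated function and nothing else.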