
Converting HTML to RDF

Developer https://www.devze.com 2022-12-21 10:32 Source: the web

I'm looking for a general-purpose API, web service, or tool that can convert a given HTML page to an RDF graph that is as specific as possible (most probably using a backbone ontology and/or a mapper).


Have you tried GRDDL?

GRDDL is a technique for obtaining RDF data from XML documents and in particular XHTML pages.
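
To give a feel for how a GRDDL-aware page announces itself, here is a minimal sketch in Python (standard library only). It assumes a well-formed XHTML input whose head carries the GRDDL profile URI and a link with rel="transformation"; the sample page and the example.org stylesheet URL are made up for illustration. It only discovers the transformation links; a real GRDDL agent would then run the linked XSLT over the page to produce RDF/XML, and would also look at a elements, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"

# Hypothetical sample page; the stylesheet URL is an assumption for illustration.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>Example page</title>
    <link rel="transformation" href="http://example.org/extract-rdf.xsl" />
  </head>
  <body><p>Hello</p></body>
</html>"""


def find_grddl_transformations(xhtml_text):
    """Return the hrefs of GRDDL transformation links found in the head."""
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None:
        return []
    # The profile attribute is a space-separated list of URIs.
    if GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    hrefs = []
    for link in head.iter(f"{{{XHTML_NS}}}link"):
        # rel is a space-separated list of tokens as well.
        if "transformation" in link.get("rel", "").split():
            href = link.get("href")
            if href:
                hrefs.append(href)
    return hrefs


if __name__ == "__main__":
    print(find_grddl_transformations(SAMPLE))
```

Pages that are not well-formed XML would need to be cleaned up first, since ElementTree rejects ordinary tag-soup HTML.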


I used XQuery to extract the data from the given set of web pages. I had to write custom queries for those pages. I think this is the most straightforward approach for a specific set of HTML files. However, it obviously does not work in the general case: a different set of web pages needs new custom queries.
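
As a rough illustration of the "custom queries per site" idea, here is a sketch in Python rather than XQuery, using the limited XPath support in the standard library's ElementTree. Everything specific in it is an assumption: the page layout (div class="person"), the base URI, and the choice of FOAF as the target vocabulary are made up, and the input must be well-formed XHTML. It is not my original queries, just the same pattern in miniature.

```python
import xml.etree.ElementTree as ET

# Hypothetical page layout; a real site needs its own queries.
PAGE = """<html>
  <body>
    <div class="person" id="alice">
      <span class="name">Alice</span>
      <a class="homepage" href="http://example.org/alice">home</a>
    </div>
    <div class="person" id="bob">
      <span class="name">Bob</span>
    </div>
  </body>
</html>"""

BASE = "http://example.org/people#"  # assumed base URI for subjects
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(text):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def page_to_ntriples(xhtml_text):
    """Map each person block on the page to a few FOAF triples."""
    root = ET.fromstring(xhtml_text)
    lines = []
    # Site-specific query: every div marked as a person.
    for person in root.findall(".//div[@class='person']"):
        subject = f"<{BASE}{person.get('id')}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{FOAF}Person> .")
        name = person.find("span[@class='name']")
        if name is not None and name.text:
            literal = escape_literal(name.text.strip())
            lines.append(f'{subject} <{FOAF}name> "{literal}" .')
        home = person.find("a[@class='homepage']")
        if home is not None and home.get("href"):
            lines.append(f"{subject} <{FOAF}homepage> <{home.get('href')}> .")
    return lines


if __name__ == "__main__":
    print("\n".join(page_to_ntriples(PAGE)))
```

The queries and the mapping to properties are hard-coded to one layout, which is exactly why this does not generalize: a new site means a new set of queries.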


I used jsoup to scrape data from HTML. It queries the HTML DOM in a jQuery-like style, which I was already familiar with, so it was a really simple tool for me to use. I also found it quite robust, but I only needed it to scrape 3 data sources, so I don't have much experience with this tool yet.
