Parsing out the HTML doctype tag in Nokogiri

开发者 https://www.devze.com 2023-02-24 06:41 Source: Internet
How can I parse out the doctype tag to get the HTML version from an HTML file?

Trying to use doctype (or DOCTYPE or !DOCTYPE) as an argument in XPath raises an invalid expression error.


The doctype is not part of the document tree but of its DTD, so XPath cannot select it. Nokogiri exposes it through the document's internal_subset:

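require 'nokogiri'

# Sample document with a made-up doctype so each field is easy to spot.
html = <<~HTML
  <!DOCTYPE foo PUBLIC "bar" "qux">
  <html>
  </html>
HTML

doc = Nokogiri::HTML(html)

# internal_subset is the DTD node built from the doctype declaration.
dtd = doc.internal_subset
puts dtd.name        # name after <!DOCTYPE (foo)
puts dtd.external_id # public identifier (bar)
puts dtd.system_id   # system identifier (qux)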
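
To get from those fields to the HTML version the question asks about, one option is to inspect the public identifier. The sketch below is only an illustration: html_version is a hypothetical helper, not part of Nokogiri, and it assumes W3C-style public identifiers of the form "-//W3C//DTD <version>//EN", treating a bare html doctype with no public identifier as HTML5. It takes plain strings, so it runs without Nokogiri.

```ruby
# Hypothetical helper (not part of Nokogiri): guesses the HTML version
# from the doctype name and public identifier strings.
def html_version(name, public_id)
  # A doctype without a name cannot be classified.
  return nil if name.nil? || name.strip.empty?

  # <!DOCTYPE html> carries no public identifier; assume HTML5 in that case.
  if public_id.nil? || public_id.strip.empty?
    return name.casecmp?('html') ? 'HTML5' : nil
  end

  # W3C public identifiers look like "-//W3C//DTD HTML 4.01 Transitional//EN";
  # the text between "//DTD " and the next "//" names the version.
  match = public_id.match(%r{//DTD\s+(.+?)//})
  match && match[1]
end

puts html_version('html', nil)                                      # HTML5
puts html_version('HTML', '-//W3C//DTD HTML 4.01 Transitional//EN') # HTML 4.01 Transitional
puts html_version('html', '-//W3C//DTD XHTML 1.0 Strict//EN')       # XHTML 1.0 Strict
```

With a parsed document, the arguments would presumably be doc.internal_subset.name and doc.internal_subset.external_id. An identifier that does not follow the assumed pattern, such as the made-up "bar" above, yields nil.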