
Hpricot: search for all the tags under one specific namespace

Developer https://www.devze.com 2023-03-31 08:55 Source: Internet
For example, I have the following markup:

<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8" />
  <title><io:content part="title" /></title>
  <link rel="icon" href="/document/7e9f29e2-cdee-4f85-ba25-132fa867aa90/latest" type="image/x-icon" />
  <n1:content description="Standard CSS" uuid="d069071c-3534-4945-9fb6-2d7be35a165e" />
  <n1:term>Content Development</n1:term>
</head>

This XHTML snippet is not strictly valid because the namespace prefixes are never declared, so I cannot use Nokogiri, which has better namespace support.

I want to do a single search that finds both the <n1:content> and <n1:term> nodes, along with every other tag under the 'n1' namespace.

How can I achieve that? Thanks!


It looks like Hpricot does not fully support namespaces.

If you know the element name, you can select it regardless of prefix:

doc.search("title")
=> #<Hpricot::Elements[{elem <title> {emptyelem <io:content part="title">} </title>}]>

... but this is not what you asked for.

Here's my hack of a workaround: first find all the tag names that use the namespace prefix with a regex, then search for those names using Hpricot:

elems = doc.to_s.scan(/<\s*(n1:\w+)/).uniq.join("|")
=> "n1:content|n1:term"
doc.search(elems)
=> #<Hpricot::Elements[{emptyelem <n1:content description="Standard CSS" uuid="d069071c-3534-4945-9fb6-2d7be35a165e">}, {elem <n1:term> "Content Development" </n1:term>}]>
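The regex step above can be wrapped in a small helper so that the prefix is not hard-coded. This is only a sketch: `prefixed_tag_selector` is a hypothetical name, not part of Hpricot, and it uses nothing beyond Ruby's built-in `String#scan`. It assumes the markup is available as a string (for example from `doc.to_s`) and that the resulting string is passed to `doc.search` exactly as in the transcript above.

```ruby
# Hypothetical helper (not part of Hpricot): builds a "|"-joined selector
# string covering every tag name that uses the given namespace prefix.
def prefixed_tag_selector(markup, prefix)
  # Match opening tags such as <n1:content ...> and capture "n1:content".
  # Closing tags (</n1:term>) are skipped because of the "/" after "<".
  pattern = /<\s*(#{Regexp.escape(prefix)}:[\w.-]+)/
  markup.scan(pattern).flatten.uniq.join("|")
end

# Sample markup modelled on the snippet in the question.
markup = <<~XHTML
  <head>
    <title><io:content part="title" /></title>
    <n1:content description="Standard CSS" />
    <n1:term>Content Development</n1:term>
    <n1:term>Another term</n1:term>
  </head>
XHTML

selector = prefixed_tag_selector(markup, "n1")
puts selector  # prints n1:content|n1:term
```

If no tag uses the prefix, the helper returns an empty string, so it is probably worth checking for that before handing the result to `doc.search`. Because the scan is purely textual, it will also pick up matches inside comments or CDATA sections.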