
XPath performance & versions

Source: https://www.devze.com, 2023-03-15 01:04 (reposted from the web)

I have 3 questions:

1) Is the XPath string "//table[position()=8 or position()=10]/td[1]/span[2]/text()" faster than the XPath string "//table[8]/td[1]/span[2]/text() | //table[10]/td[1]/span[2]/text()"?

I use XPath from C# on .NET with HtmlAgilityPack.

2) How can I determine which version of XPath I am using? If it is XPath 1.0, how do I upgrade to XPath 2.0?

3) Does XPath 2.0 bring performance optimizations and improvements, or just new features and new syntax?


XPath 2.0 expands significantly on XPath 1.0 (read here for a summary), though you don't need to switch unless you would benefit from the new functionality.

As for which one would be faster, I believe the first one would be, because the second one repeats the node search. The first is also more readable, and in general you want to go with the more readable one anyway.


As to the performance question, I'm afraid I don't know. It depends on the optimizer in the particular XPath processor you are using. If it's important to you, measure it. If it's not important enough to measure, then it's not important enough to worry about.
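A rough way to "measure it" is sketched below. It is written in Java against the JDK's bundled `javax.xml.xpath` processor purely to stay self-contained, so it is a stand-in for, not a reproduction of, the .NET and HtmlAgilityPack setup in the question. The class name, the helper names, and the generated sample document are all hypothetical, and timings from one processor say nothing certain about another.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathTimingSketch {
    // The two expressions from the question.
    static final String OR_PREDICATE =
        "//table[position()=8 or position()=10]/td[1]/span[2]/text()";
    static final String UNION =
        "//table[8]/td[1]/span[2]/text() | //table[10]/td[1]/span[2]/text()";

    // Builds a hypothetical document: sibling tables, each with one td and two spans.
    static Document sampleDocument(int tables) {
        StringBuilder xml = new StringBuilder("<body>");
        for (int i = 1; i <= tables; i++) {
            xml.append("<table><td><span>a</span><span>cell")
               .append(i).append("</span></td></table>");
        }
        xml.append("</body>");
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml.toString())));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static XPathExpression compile(String expression) {
        try {
            return XPathFactory.newInstance().newXPath().compile(expression);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(expression, e);
        }
    }

    // Number of nodes the expression selects; used to check both forms agree.
    static int count(String expression, Document doc) {
        try {
            NodeList nodes = (NodeList) compile(expression).evaluate(doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Average wall-clock nanoseconds per evaluation, after an equal number of warm-up runs.
    static long averageNanos(String expression, Document doc, int runs) {
        XPathExpression compiled = compile(expression);
        try {
            for (int i = 0; i < runs; i++) {
                compiled.evaluate(doc, XPathConstants.NODESET); // warm-up, not timed
            }
            long start = System.nanoTime();
            for (int i = 0; i < runs; i++) {
                compiled.evaluate(doc, XPathConstants.NODESET);
            }
            return (System.nanoTime() - start) / runs;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = sampleDocument(200);
        System.out.println("or-predicate: " + averageNanos(OR_PREDICATE, doc, 2000) + " ns");
        System.out.println("union:        " + averageNanos(UNION, doc, 2000) + " ns");
    }
}
```

Checking that both expressions return the same node count before timing them guards against comparing queries that are not actually equivalent on your data. A loop like this is only a coarse indicator; run it against your real documents and your real processor before drawing conclusions.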

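As I mentioned in my previous reply, //table[8] smells wrong to me. I think it's much more likely that you want (//table)[8]. Both are valid XPath expressions, but they produce different answers: //table[8] selects every table that is the eighth table child of its parent, whereas (//table)[8] selects the eighth table in the whole document.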
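The difference between the two forms is easier to see on a small document. The sketch below uses Java and the JDK's bundled XPath 1.0 processor as a self-contained stand-in for the asker's C# setup; the class name, the helper, and the sample markup are hypothetical, and index 2 is used instead of 8 only to keep the sample short.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PositionPredicateDemo {
    // Hypothetical sample: two containers, each holding two tables.
    static final String SAMPLE =
        "<body>"
      + "<div><table id='t1'/><table id='t2'/></div>"
      + "<div><table id='t3'/><table id='t4'/></div>"
      + "</body>";

    // Evaluates an expression against SAMPLE and returns the matched ids, comma-separated.
    static String ids(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(SAMPLE)), XPathConstants.NODESET);
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < nodes.getLength(); i++) {
                if (i > 0) {
                    out.append(',');
                }
                out.append(nodes.item(i).getAttributes().getNamedItem("id").getNodeValue());
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Second table child of each parent: one match per div.
        System.out.println("//table[2]   -> " + ids("//table[2]"));
        // Second table in document order: a single match.
        System.out.println("(//table)[2] -> " + ids("(//table)[2]"));
    }
}
```

On this sample, //table[2] matches t2 and t4, while (//table)[2] matches only t2. On a real HTML page where tables are nested in different containers, the same distinction decides whether //table[8] finds anything at all.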

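You can probably assume that a processor is XPath 1.0 unless it says otherwise - if it supports 2.0, they'll want you to know. But you can easily test, for example by seeing what happens when you do //a except //b. The except operator was introduced in XPath 2.0, so a 1.0 processor should reject the expression as a syntax error.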
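The probe described above might look like the following. This is a sketch in Java against the JDK's bundled `javax.xml.xpath` processor, which to my knowledge implements XPath 1.0; the class and method names are hypothetical, and the same idea would need to be translated to whatever API your .NET processor exposes.

```java
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

class XPathVersionProbe {
    // Returns true if the default processor accepts the expression's syntax.
    static boolean compiles(String expression) {
        try {
            XPathFactory.newInstance().newXPath().compile(expression);
            return true;
        } catch (XPathExpressionException e) {
            // A processor that does not know the syntax reports a compile error.
            return false;
        }
    }

    public static void main(String[] args) {
        // The union operator is valid in both XPath 1.0 and 2.0.
        System.out.println("//a | //b compiles:      " + compiles("//a | //b"));
        // The except operator only exists from XPath 2.0 onward.
        System.out.println("//a except //b compiles: " + compiles("//a except //b"));
    }
}
```

If the second line reports false while the first reports true, the processor is almost certainly limited to XPath 1.0. Probing syntax like this avoids relying on documentation that may be out of date.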

There's no intrinsic reason why an XPath 2.0 processor should be faster than a 1.0 processor on the same queries. In fact, it might be a bit slower, because it's required to do more careful type-checking. On the other hand it might be a lot faster, because many 1.0 processors were dashed off very quickly and never upgraded. But there are massive improvements in functionality in 2.0, for example regular expression support.
