I'm trying to figure out a method of storing a unique reference to each tag on a particular page. I won't have any ability to edit the page content, and I'll need the generated UID to stay the same on every page refresh.
Since browsers don't generate any kind of UID for elements, I was thinking that the only way to do this would be to execute a script which walks the DOM and creates a UID for each tag it comes across. I don't know how accurate this will be, especially since I need to ensure it creates the same UID for a given tag each time the script crawls the page.
Can anyone think of any other, more accurate ways of mapping a page?
Many thanks.
I need exactly the same functionality. One idea I had was to look at the location of the tag relative to a fixed element, such as the BODY tag, and use an XPath-like expression as the unique ID. So, for example, if there is HTML like
<BODY><TABLE><TD>
...etc., the unique id for the TD could be /Body/Table/1...and so on. But this assumes that the next time the page renders, there will not be more nodes than before. A slight improvement would be to use "id" attributes in the path wherever they are present, and fall back to position where they are not. For example, suppose the page is:
<BODY>
<DIV id="test">
<TABLE id="testtable">
<TR><TD></TD></TR>
.....
The unique id of the TD tag can be /Body/Div@test/Table@testtable/TD@0 etc.
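Here is a minimal sketch of that idea, not a definitive implementation. The function name elementPath is made up for illustration, and the output format (lowercased tag names, "@id" where an id exists, otherwise "@n" with n being the index among same-tag siblings) is just one assumed reading of the scheme above. It only relies on tagName, id, parentElement and children, which DOM elements provide.

```javascript
// Hypothetical helper: builds an XPath-like path from BODY down to `el`.
// Each step is "tag@id" when the element has an id, otherwise "tag@n",
// where n is the element's zero-based index among siblings with the same tag.
function elementPath(el) {
  const parts = [];
  let node = el;
  while (node && node.tagName) {
    const tag = node.tagName.toLowerCase();
    const parent = node.parentElement;
    if (tag === "body" || !parent) {
      // BODY (or a detached root) is the fixed anchor, so stop here.
      parts.unshift(tag);
      break;
    }
    if (node.id) {
      parts.unshift(tag + "@" + node.id);
    } else {
      const sameTag = Array.from(parent.children)
        .filter((sibling) => sibling.tagName === node.tagName);
      parts.unshift(tag + "@" + sameTag.indexOf(node));
    }
    node = parent;
  }
  return "/" + parts.join("/");
}
```

In a browser you would call it as elementPath(document.querySelector("td")). Note that this sketch includes every ancestor in the path, so TR shows up too, and browsers insert a TBODY inside a TABLE when parsing HTML, so the real path will contain one more step than the markup suggests.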
If the page's content stays the same between refreshes, then the obvious way is to generate the UID from the element's position in the DOM. It doesn't even need to be an XPath expression; a simple integer will do. However, if the content can change between refreshes, the task becomes much harder (if not impossible).
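A rough sketch of the simple-integer approach, assuming the page structure really is identical on every load. The name indexElements is hypothetical; it numbers elements in the order a depth-first walk visits them, which is document order.

```javascript
// Hypothetical helper: walks the tree under `root` depth-first and maps each
// element to its position in that walk (0 for root, 1 for its first child, ...).
function indexElements(root) {
  const uids = new Map();
  let next = 0;
  (function visit(node) {
    uids.set(node, next++);
    for (const child of Array.from(node.children)) {
      visit(child);
    }
  })(root);
  return uids;
}
```

In a browser, indexElements(document.body).get(someElement) would give that element's number. Any node inserted or removed earlier in the document shifts every number after it, which is why this only works for unchanging pages.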