
Get (text) in XPath

Developer https://www.devze.com 2023-02-20 08:40 Source: Internet

I have the following DOM structure / HTML, and I want to get (just practicing...) the marked data.

I want the text that sits under the h2 element. The div[@class="coordsAgence"] element has some more div children below it, and some more h2 elements, so doing:

div[@class="coordsAgence"]

will get that value, but with additional unneeded text. UPDATE: The value (from this example) that I want is the "GALLIER Dennis" text.


It seems you want the first text node in that div:

div[@class="coordsAgence"]/text()[1]

should do it.

Note that this assumes that there is actually no whitespace between those comments inside <div class="coordsAgence">; otherwise that whitespace will constitute additional text nodes that you'll have to account for.
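The whitespace caveat can be checked with a small sketch. The markup below is hypothetical (the original page is not reproduced here) and only mimics the shape described in the question. lxml is used for convenience, which is an assumption, and the normalize-space() predicate is one possible way to skip whitespace-only text nodes rather than part of the answer above.

```python
from lxml import etree

# Hypothetical markup: it only mimics the structure described in the question.
compact = (
    '<root><div class="coordsAgence"><h2>Agence</h2>GALLIER Dennis'
    '<div>12 rue Exemple</div><h2>Horaires</h2></div></root>'
)

# Same content, but indented, so whitespace text nodes sit between the tags.
indented = """<root>
  <div class="coordsAgence">
    <h2>Agence</h2>
    GALLIER Dennis
    <div>12 rue Exemple</div>
    <h2>Horaires</h2>
  </div>
</root>"""

expr = '//div[@class="coordsAgence"]/text()[1]'

# Without whitespace, the first text node is the wanted value.
first_compact = etree.fromstring(compact).xpath(expr)[0]

# With indentation, the first text node is the whitespace before the <h2>.
first_indented = etree.fromstring(indented).xpath(expr)[0]

# Filtering out whitespace-only text nodes before taking the first one
# makes the expression independent of indentation.
robust = '//div[@class="coordsAgence"]/text()[normalize-space()][1]'
name = etree.fromstring(indented).xpath(robust)[0].strip()

print(repr(first_compact))
print(repr(first_indented))
print(name)
```

The call to strip() is still needed in the last step, because the matched text node keeps its surrounding line breaks and indentation.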


Get the first text node following the first h2 in the div with class "coordsAgence":

div[@class='coordsAgence']/h2[1]/following-sibling::text()[1]

Note that this first expression returns the first text node after the first h2 even when some other node appears between the two. If you want to return the text only when it's the node that immediately follows the first h2, then try something like this:

div[@class='coordsAgence']/h2[1][following-sibling::node()[1][self::text()]]/following-sibling::text()[1]
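To see how the two expressions differ, here is a minimal sketch using lxml (an assumption, as is the sample markup, which is invented for illustration). In the second snippet a span element sits between the first h2 and the next text node, so only the first, looser expression returns anything.

```python
from lxml import etree

# Hypothetical markup, for illustration only.
direct = etree.fromstring(
    '<div class="coordsAgence"><h2>Agence</h2>GALLIER Dennis<div>x</div></div>'
)
# Here a <span> element comes between the first <h2> and the next text node.
indirect = etree.fromstring(
    '<div class="coordsAgence"><h2>Agence</h2><span>Tel</span>01 23<div>x</div></div>'
)

# First text node after the first h2, whatever lies in between.
loose = "//div[@class='coordsAgence']/h2[1]/following-sibling::text()[1]"

# Same, but only if the node right after the first h2 is itself a text node.
strict = (
    "//div[@class='coordsAgence']/h2[1]"
    "[following-sibling::node()[1][self::text()]]"
    "/following-sibling::text()[1]"
)

loose_direct = direct.xpath(loose)
loose_indirect = indirect.xpath(loose)
strict_direct = direct.xpath(strict)
strict_indirect = indirect.xpath(strict)  # empty: a <span> follows the h2

print(loose_direct, loose_indirect, strict_direct, strict_indirect)
```

Note that the text inside the span ("Tel") is a child of the span, not a sibling of the h2, so neither expression ever selects it.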


Using Python/Scrapy to get the text from an h1 tag (for example):

response.xpath(
        "//div[contains(@class, 'class_name')]//h1[contains(@class, 'class_name')]/text()"
    ).get()