I have a question about parsing an XML document with XPath in Ruby.
Here is a small part of my XML:
...
<Row>
<Cell ss:StyleID="s248"><Data ss:Type="String">Picardie</Data></Cell>
<Cell ss:Index="4" ss:StyleID="s28"/>
</Row>
<Row>
<Cell ss:StyleID="s249"><Data ss:Type="String"><Font html:Color="#0000D4"
xmlns="http://www.w3.org/TR/REC-html40">80 Somme</Font></Data></Cell>
<Cell ss:Index="3" ss:StyleID="s30"/>
<Cell ss:StyleID="s28"/>
</Row>
<Row>
<Cell ss:StyleID="s250"><Data ss:Type="String"><Font html:Color="#DD0806"
xmlns="http://www.w3.org/TR/REC-html40">André</Font></Data></Cell>
<Cell ss:Index="3" ss:StyleID="s30"/>
<Cell ss:StyleID="s28"/>
</Row>
<Row>
<Cell ss:StyleID="s36"><Data ss:Type="String">23, rue des Lingers </Data></Cell>
<Cell ss:StyleID="s36"><Data ss:Type="String">80100 ABBEVILLE</Data></Cell>
<Cell ss:StyleID="s38"><Data ss:Type="String">'</Data></Cell>
</Row>
<Row ss:StyleID="s82">
<Cell ss:StyleID="s49"><Data ss:Type="String">32, rue des Trois Cailloux</Data></Cell>
<Cell ss:StyleID="s49"><Data ss:Type="String">80000 AMIENS</Data></Cell>
<Cell ss:StyleID="s48"><Data ss:Type="String">03.22.22.01.66</Data></Cell>
<Cell ss:StyleID="s85"/>
</Row>
...
Desired output:
...
'Picardie' '80 Somme' 'André' '23, rue des Lingers' '80100 ABBEVILLE'
'Picardie' '80 Somme' 'André' '32, rue des Trois Cailloux' '80000 AMIENS' '03.22.22.01.66'
...
Do you have any idea how to get this?
Nokogiri is quite a standard tool for this job:
http://nokogiri.org/
Here's an example from the docs:
# Search for nodes by xpath
doc.xpath('//h3/a[@class="l"]').each do |link|
puts link.content
end
Sorry, I'm at work, so I don't have the time to give you a snippet specific to your problem, but I'm sure you can figure it out from the docs and the short example :-)
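For what it's worth, here is a rough sketch aimed at the sample in the question. It uses REXML, which ships with Ruby, rather than Nokogiri, so it runs without installing a gem. Several things in it are assumptions and not from the question: the <Table> wrapper (added only so the ss: and html: prefixes are declared), the mapping of StyleIDs s248/s249/s250 to the three heading levels, and the rule that drops cells with no letters or digits (such as the lone apostrophe). The names SAMPLE, HEADING_LEVELS, cell_text and flatten_rows are made up for illustration.

```ruby
require "rexml/document"

# Hypothetical wrapper: the question elides the enclosing elements, so a
# <Table> root is added here only to declare the ss: and html: prefixes.
SAMPLE = <<~'DOC'
  <Table xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
         xmlns:html="http://www.w3.org/TR/REC-html40">
    <Row>
      <Cell ss:StyleID="s248"><Data ss:Type="String">Picardie</Data></Cell>
      <Cell ss:Index="4" ss:StyleID="s28"/>
    </Row>
    <Row>
      <Cell ss:StyleID="s249"><Data ss:Type="String"><Font html:Color="#0000D4"
        xmlns="http://www.w3.org/TR/REC-html40">80 Somme</Font></Data></Cell>
      <Cell ss:Index="3" ss:StyleID="s30"/>
      <Cell ss:StyleID="s28"/>
    </Row>
    <Row>
      <Cell ss:StyleID="s250"><Data ss:Type="String"><Font html:Color="#DD0806"
        xmlns="http://www.w3.org/TR/REC-html40">André</Font></Data></Cell>
      <Cell ss:Index="3" ss:StyleID="s30"/>
      <Cell ss:StyleID="s28"/>
    </Row>
    <Row>
      <Cell ss:StyleID="s36"><Data ss:Type="String">23, rue des Lingers </Data></Cell>
      <Cell ss:StyleID="s36"><Data ss:Type="String">80100 ABBEVILLE</Data></Cell>
      <Cell ss:StyleID="s38"><Data ss:Type="String">'</Data></Cell>
    </Row>
    <Row ss:StyleID="s82">
      <Cell ss:StyleID="s49"><Data ss:Type="String">32, rue des Trois Cailloux</Data></Cell>
      <Cell ss:StyleID="s49"><Data ss:Type="String">80000 AMIENS</Data></Cell>
      <Cell ss:StyleID="s48"><Data ss:Type="String">03.22.22.01.66</Data></Cell>
      <Cell ss:StyleID="s85"/>
    </Row>
  </Table>
DOC

# Assumption read off the sample: these StyleIDs mark the heading levels
# (region, department, person). Adjust them to the real workbook.
HEADING_LEVELS = { "s248" => 0, "s249" => 1, "s250" => 2 }.freeze

# All text below a cell, including text nested inside <Font>, trimmed.
def cell_text(cell)
  REXML::XPath.match(cell, ".//text()").map(&:value).join.strip
end

# Returns one array per data row: the current headings followed by the
# row's own non-empty cell texts.
def flatten_rows(xml)
  doc = REXML::Document.new(xml)
  headings = []
  lines = []
  REXML::XPath.each(doc, "//Row") do |row|
    cells = REXML::XPath.match(row, "Cell")
    # Keep only cells that contain at least one letter or digit.
    texts = cells.map { |c| cell_text(c) }.select { |t| t.match?(/[[:alnum:]]/) }
    next if texts.empty?

    level = HEADING_LEVELS[cells.first.attributes["ss:StyleID"]]
    if level
      # A heading replaces everything at its level and below.
      headings = headings.first(level) + [texts.first]
    else
      lines << headings + texts
    end
  end
  lines
end

flatten_rows(SAMPLE).each do |line|
  puts line.map { |t| "'#{t}'" }.join(" ")
end
```

A real Excel XML file usually declares a default namespace on its root element, so the plain "//Row" path may need adjusting there.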
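This XPath expression: //set//*[not(descendant::*)]/text()
selects the text nodes of every leaf element (an element that contains no other elements) under "set", which gives you the text of each cell in a set of rows. Replace "set" with the name of the parent node of your rows.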
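To see what that expression picks up, here is a small sketch using REXML from Ruby's standard distribution. The <Table> wrapper is an assumption (the question does not show the parent of the rows), and SNIPPET and leaf_texts are names invented for this example. Note that the text inside <Font> is found too, because <Font> is the leaf there, not <Data>.

```ruby
require "rexml/document"

# Hypothetical wrapper element; only the <Row> contents come from the question.
SNIPPET = <<~'DOC'
  <Table xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
         xmlns:html="http://www.w3.org/TR/REC-html40">
    <Row>
      <Cell ss:StyleID="s248"><Data ss:Type="String">Picardie</Data></Cell>
      <Cell ss:Index="4" ss:StyleID="s28"/>
    </Row>
    <Row>
      <Cell ss:StyleID="s249"><Data ss:Type="String"><Font html:Color="#0000D4"
        xmlns="http://www.w3.org/TR/REC-html40">80 Somme</Font></Data></Cell>
    </Row>
    <Row>
      <Cell ss:StyleID="s36"><Data ss:Type="String">23, rue des Lingers </Data></Cell>
      <Cell ss:StyleID="s36"><Data ss:Type="String">80100 ABBEVILLE</Data></Cell>
    </Row>
  </Table>
DOC

# Text of every element under `parent` that has no element descendants.
def leaf_texts(xml, parent)
  doc = REXML::Document.new(xml)
  path = "//#{parent}//*[not(descendant::*)]/text()"
  REXML::XPath.match(doc, path).map { |text| text.value.strip }
end

puts leaf_texts(SNIPPET, "Table").inspect
```

This gives one flat list, so the rows still have to be regrouped afterwards if each address line must carry its headings.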