I need to sort an HTML table with the following structure in Python.
<table>
<tr>
<td><a href="#">ABCD</a></td>
<td>A23BND</td>
<td><a title="ABCD">345345</a></td>
</tr>
<tr>
<td><a href="#">EFG</a></td>
<td>Add4D</td>
<td><a title="EFG">3432</a></td>
</tr>
<tr>
<td><a href="#">HG</a></td>
<td>GJJ778</td>
<td><a title="HG">2341333</a></td>
</tr>
</table>
I am doing something like this:
container = tree.findall("tr")
strOut = ""
data = []
for elem in container:
    key = elem.findtext(colName)
    data.append((key, elem))
data.sort()
The problem is that it sorts by the text inside the <td>. I want to be able to sort by the anchor value and not the href.
What can I do to achieve that? Thanks a lot.
It sorts by the text because that's what you're extracting as the key when you do
key = elem.findtext(colName)
I imagine colName is some tag string, and findtext will just return the text of the first subelement matching that tag. If what you want instead is to use as the key the value of some attribute (e.g. title?) of an <a>, then
# './/a' searches all descendants; a bare 'a' would only match direct children of the <tr>
for ana in elem.findall('.//a'):
    key = ana.get('title')
    if key is not None: break
would do that. Exactly what do you want to use as the key?
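To make the idea above concrete, here is a minimal sketch that sorts the rows by the first anchor's text, or by an attribute such as title when one is named. It assumes the table is parsed with xml.etree.ElementTree and that the markup is well-formed (every <a> closed); the names anchor_key and sorted_rows are hypothetical helpers, not part of any library.

```python
import xml.etree.ElementTree as ET

# Sample markup modelled on the question, deliberately out of order.
# Assumption: every <a> is closed so the XML parser accepts it.
HTML = """<table>
<tr><td><a href="#">HG</a></td><td>GJJ778</td><td><a title="HG">2341333</a></td></tr>
<tr><td><a href="#">ABCD</a></td><td>A23BND</td><td><a title="ABCD">345345</a></td></tr>
<tr><td><a href="#">EFG</a></td><td>Add4D</td><td><a title="EFG">3432</a></td></tr>
</table>"""


def anchor_key(row, attr=None):
    """Return the first <a> text in the row, or the first value of `attr`
    found on an <a> when `attr` is given. Falls back to an empty string."""
    for anchor in row.iter("a"):  # iter() walks all descendants of the row
        value = anchor.text if attr is None else anchor.get(attr)
        if value is not None:
            return value
    return ""


def sorted_rows(table, attr=None):
    """Return the table's <tr> elements as a new list ordered by anchor_key."""
    return sorted(table.findall("tr"), key=lambda row: anchor_key(row, attr))


table = ET.fromstring(HTML)
by_text = [anchor_key(row) for row in sorted_rows(table)]
by_title = [anchor_key(row, "title") for row in sorted_rows(table, "title")]
print(by_text)
print(by_title)
```

If the table contains nothing but <tr> children, assigning table[:] = sorted_rows(table) would put the rows back in sorted order before serializing with ET.tostring.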
The sort method has the valuable key and cmp arguments, which you can use for custom sorting. If you augment the data list with the extra information you need for sorting, you can pass either key or cmp (depending on the exact need) to sort to achieve what you want. Note that cmp exists only in Python 2; Python 3 removed it, and functools.cmp_to_key wraps an old-style comparison function as a key. Here's a simple example:
In [60]: ids = [1, 2, 3]
In [61]: score = {1: 20, 2: 70, 3: 40}
In [62]: ids.sort(key=lambda x: score[x])
In [63]: ids
Out[63]: [1, 3, 2]
Here, I sorted the ids list according to the score of each id, taken from the score dictionary.
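Applied to the table from the question, the same key argument could look like the sketch below. It assumes the rows should be ordered numerically by the text of the anchor that carries a title attribute, which is only one possible reading of the question; the sample markup is trimmed and the variable names are illustrative. Passing key=lambda pair: pair[0] also keeps sort from comparing the Element objects themselves when two keys are equal.

```python
import xml.etree.ElementTree as ET

# Trimmed sample rows; assumption: anchors are closed so the parser accepts them.
ROWS = """<table>
<tr><td>ABCD</td><td><a title="ABCD">345345</a></td></tr>
<tr><td>EFG</td><td><a title="EFG">3432</a></td></tr>
<tr><td>HG</td><td><a title="HG">2341333</a></td></tr>
</table>"""

table = ET.fromstring(ROWS)

# Augment each row with the extra information needed for sorting,
# as in the question's (key, elem) tuples.
data = []
for row in table.findall("tr"):
    anchor = row.find(".//a[@title]")  # first <a> that has a title attribute
    data.append((int(anchor.text), row))  # int() gives numeric, not string, order

# Sort on the precomputed key only; the Element is never compared.
data.sort(key=lambda pair: pair[0])

ordered_titles = [row.find(".//a[@title]").get("title") for _, row in data]
print(ordered_titles)
```

Without the int() conversion the keys would compare as strings, so "2341333" would sort before "3432".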
I know this wasn't your question, but best practice for this sort of thing is to use JavaScript. You will get a much better user experience on your website (if that's what you're doing).
This JS library is excellent and easy to use: http://www.kryogenix.org/code/browser/sorttable/