
Python sorting problem [duplicate]

Developer https://www.devze.com 2023-01-03 13:04 Source: Internet
This question already has answers here: Closed 12 years ago.

Possible Duplicate:

Python analog of natsort function (sort a list using a “natural order” algorithm)

I'm sure this is simple, but I can't figure it out. I have a list of strings like this (after using sorted() on it):

Season 2, Episode 1: A Flight to Remember
Season 2, Episode 20: Anthology of Interest I
Season 2, Episode 2: Mars University
Season 2, Episode 3: When Aliens Attack
....
Season 3, Episode 10: The Luck of the Fryrish
Season 3, Episode 11: The Cyber House Rules
Season 3, Episode 12: Insane in the Mainframe
Season 3, Episode 1: The Honking
Season 3, Episode 2: War Is the H-Word

How can I sort them properly (by season, then by episode number, ascending)?


Use the key parameter of the list's sort() method to specify the key you would like to sort by.

import re

def get_sort_key(s):
    # Extract the season and episode numbers as integers so they compare numerically.
    m = re.match(r'Season ([0-9]+), Episode ([0-9]+): .*', s)
    return (int(m.group(1)), int(m.group(2)))

my_list.sort(key=get_sort_key)
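
One caveat with the function above: re.match returns None for a line that does not fit the pattern (such as the "...." placeholder in the question), and m.group then raises an AttributeError. A minimal sketch of a more forgiving key follows. The names EPISODE_RE, safe_sort_key and titles are made up for illustration, and pushing unparseable lines to the end is an assumption, not something the question asks for.

```python
import re

# Hypothetical name; same regular expression as in the answer above.
EPISODE_RE = re.compile(r'Season ([0-9]+), Episode ([0-9]+): .*')

def safe_sort_key(s):
    """Return (0, season, episode) for matching lines, (1, 0, 0) otherwise."""
    m = EPISODE_RE.match(s)
    if m is None:
        # Assumption: lines that do not match should sort after all episodes.
        return (1, 0, 0)
    return (0, int(m.group(1)), int(m.group(2)))

# Sample data taken from the question, plus one line that does not match.
titles = [
    "Season 2, Episode 1: A Flight to Remember",
    "Season 2, Episode 20: Anthology of Interest I",
    "Season 2, Episode 2: Mars University",
    "Season 3, Episode 10: The Luck of the Fryrish",
    "Season 3, Episode 1: The Honking",
    "....",
]

# sorted() returns a new list and leaves titles untouched.
ordered = sorted(titles, key=safe_sort_key)
```

The leading 0 or 1 in the tuple only decides whether a line parsed; the season and episode integers do the actual ordering.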


There are two ways to approach this:

  1. Define your own comparison function cmp(x, y), where x and y are strings, that returns -1 if the first should sort before the second, 1 if it should sort after, and 0 if they're equal. In Python 2, pass this function as the "cmp" argument to sort(). Python 3 removed that argument, so wrap the function with functools.cmp_to_key and pass the result as "key" instead.

  2. Convert all of the strings into a format whose default string ordering is exactly what you want. For example, you could zero-pad the numbers, as in "Season 03, Episode 07", and then sort them with a plain sort().

Either way, I'd suggest using a simple regular expression to get the season and episode out of the string, something like:

import re
m = re.match(r'Season ([0-9]+), Episode ([0-9]+): .*', s)
(season, episode) = (int(m.group(1)), int(m.group(2)))
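
Both approaches can be sketched roughly as follows, assuming Python 3 and that every line matches the pattern. The helper names parse, compare_episodes and zero_pad, and the padding width of 3, are made up for illustration.

```python
import re
from functools import cmp_to_key

# Same pattern as above, with the title captured as a third group.
PATTERN = re.compile(r'Season ([0-9]+), Episode ([0-9]+): (.*)')

def parse(s):
    """Return (season, episode) as integers; assumes s matches PATTERN."""
    m = PATTERN.match(s)
    return int(m.group(1)), int(m.group(2))

# Approach 1: a comparison function, adapted for Python 3 with cmp_to_key.
def compare_episodes(x, y):
    a, b = parse(x), parse(y)
    # Negative if x sorts first, positive if y sorts first, 0 if equal.
    return (a > b) - (a < b)

# Approach 2: zero-pad the numbers so plain string comparison works.
def zero_pad(s, width=3):
    m = PATTERN.match(s)
    return "Season {}, Episode {}: {}".format(
        m.group(1).zfill(width), m.group(2).zfill(width), m.group(3))

titles = [
    "Season 2, Episode 1: A Flight to Remember",
    "Season 2, Episode 20: Anthology of Interest I",
    "Season 2, Episode 2: Mars University",
    "Season 3, Episode 10: The Luck of the Fryrish",
    "Season 3, Episode 1: The Honking",
]

by_cmp = sorted(titles, key=cmp_to_key(compare_episodes))
by_pad = sorted(titles, key=zero_pad)  # padded form is used only as the key
```

Using zero_pad as the key keeps the original strings in the result. The padding width has to be at least as large as the longest number, otherwise the string comparison breaks again.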


Since you're sorting strings, they are compared character by character: "Episode 20:" sorts before "Episode 2:" because "0" comes before ":", and "Episode 10:" lands before "Episode 1:" for the same reason. The solution is to pull each string apart into its constituent parts. Extract the season and episode numbers with Python's regular expressions, cast them to integers, then pick an associative data structure you like and map the integer keys to the original strings. Sort by the keys, and you're done.
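
A minimal sketch of that idea, using a dict keyed by (season, episode) tuples. The names titles, by_number and ordered are made up for illustration, and the sketch assumes each season/episode pair appears only once, since a duplicate would overwrite the earlier entry.

```python
import re

titles = [
    "Season 2, Episode 1: A Flight to Remember",
    "Season 2, Episode 20: Anthology of Interest I",
    "Season 2, Episode 2: Mars University",
    "Season 3, Episode 10: The Luck of the Fryrish",
    "Season 3, Episode 1: The Honking",
]

# Map (season, episode) integer pairs to the original strings.
by_number = {}
for title in titles:
    m = re.match(r'Season ([0-9]+), Episode ([0-9]+): ', title)
    if m:  # skip lines that do not look like an episode entry
        by_number[(int(m.group(1)), int(m.group(2)))] = title

# Tuples compare element by element, so this sorts by season, then episode.
ordered = [by_number[key] for key in sorted(by_number)]
```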
