I have a list in this format:
exon_start exon_finish gene_name (repeated hundreds of times)
I want to sort by exon_start
Example list:
['8342758', '8344137', 'NM_001042682']
['85420368', '85421471', 'NM_032184']
['86363115', '86364485', 'NM_152890']
['89820771', '89822936', 'NM_015350']
['904123', '905900', 'NR_027693']
['91176416', '91179454', 'NM_201269']
['92418409', '92420740', 'NM_015237']
['93575521', '93577419', 'NR_034089']
['94114411', '94116006', 'NM_014597']
['99926918', '99928016', 'NM_017734']
This list of lists (printed above) has already been sorted with the following code:
sorted_triplets = sorted(triplets, key=lambda x: x[0])
for i in sorted_triplets:
print i
However, "sorted" isn't working like I expect. As you can see from the list, 904123 is less th开发者_JS百科an 89820771. So it appears that "sorted" isn't comparing the numbers as a whole, rather as individual digits.
How do I fix this?
It's sorting them as strings, so the order is 'alphabetically'. Which is to say, it's going character by character and comparing, instead of comparing them as scalar values.
So do:
sorted_triplets = sorted(triplets, key=lambda x: int(x[0]))
And it should work.
convert strings to numbers
sorted(triplets, key=lambda x: int(x[0]))
Convert exon_start to an integer, strings are sorted lexicographically.
Right, because what you have are strings, not numbers. It will be sorted lexicographically. You may want to convert them to numbers (integers) first.
It seems like your "number" is actually a string. Convert this string to an integer ( int(string) ), then the sorting should work
Add int() call.
sorted_triplets = sorted(triplets, key=lambda x: int(x[0]))
精彩评论