Order a list of files by size via Python

Developer https://www.devze.com 2022-12-14 03:00 Source: Internet
Example dump from the list of a directory:

hello:3.1 GB
world:1.2 MB
foo:956.2 KB

The above list is in the format of FILE:VALUE UNIT. How would one go about ordering each line above according to file size?

I thought perhaps to parse each line for the unit via the pattern ":VALUE UNIT" (or somehow use the delimiter), then run it through the ConvertAll engine to get the size of each value in bytes, hash it with the rest of the line (the filenames), then order the resulting dictionary pairs by size.

Trouble is, I have no idea about pattern matching. But I see that you can sort a dictionary.

If there is a better direction in which to solve this problem, please let me know.
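
For reference, the approach described above (match the pattern, convert to bytes, then sort the dictionary pairs by size) might look roughly like the following. This is only a sketch under assumptions: the regular expression and the names UNIT_BYTES, LINE_RE and to_bytes are made up for illustration, the units are taken to be binary (1024-based), and a plain lookup table stands in for the ConvertAll step.

```python
import re

# Hypothetical lookup table; binary (1024-based) units are an assumption.
UNIT_BYTES = {"KB": 2**10, "MB": 2**20, "GB": 2**30}

# FILE:VALUE UNIT, e.g. "hello:3.1 GB". The greedy (.+) means the last
# colon on the line separates the file name from the size.
LINE_RE = re.compile(r"^(.+):(\d+(?:\.\d+)?) (KB|MB|GB)$")

def to_bytes(line):
    """Return (filename, size in bytes) for one 'FILE:VALUE UNIT' line."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("unrecognised line: %r" % line)
    name, value, unit = match.groups()
    return name, float(value) * UNIT_BYTES[unit]

lines = ["hello:3.1 GB", "world:1.2 MB", "foo:956.2 KB"]

# Map each file name to its size in bytes.
sizes = dict(to_bytes(line) for line in lines)

# Sort the (name, bytes) pairs by the byte count.
ordered = sorted(sizes.items(), key=lambda pair: pair[1])
names = [name for name, _ in ordered]
print(names)
```

Note that going through a dictionary keyed by file name silently drops duplicate names, which is one reason sorting the lines directly with a key function may be simpler.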


EDIT:

The list that I had was actually in a file. Taking inspiration from the answer of the (awesome) Alex Martelli, I've written the following code, which reads the lines from one file, orders them and writes them to another.

#!/usr/bin/env python

print("Reading the entire file into a list.")

# Read the source file and strip the trailing newline from each line.
with open("SOURCE_FILE_HERE", "r") as sourceFile:
    cleanLines = [line.rstrip() for line in sourceFile]

# Multipliers that convert each unit to bytes.
mult = dict(KB=2**10, MB=2**20, GB=2**30)

def getsize(aline):
    # Split "FILE:VALUE UNIT" into the file name and the size.
    fn, size = aline.split(':', 1)
    value, unit = size.split(' ')
    multiplier = mult[unit]
    return float(value) * multiplier

print("Writing sorted list to file.")

cleanLines.sort(key=getsize)

# Mode 'a' appends to the output file if it already exists.
with open("WRITE_OUT_FILE_HERE", 'a') as writeLines:
    for line in cleanLines:
        writeLines.write(line + "\n")


thelines = ['hello:3.1 GB', 'world:1.2 MB', 'foo:956.2 KB']

mult = dict(KB=2**10, MB=2**20, GB=2**30)

def getsize(aline):
  fn, size = aline.split(':', 1)
  value, unit = size.split(' ')
  multiplier = mult[unit]
  return float(value) * multiplier

thelines.sort(key=getsize)
print(thelines)

emits ['foo:956.2 KB', 'world:1.2 MB', 'hello:3.1 GB'], as desired. Of course, you may have to add some entries to mult if KB, MB and GB don't exhaust your set of units of interest.
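
As a rough illustration of adding entries to mult, here is a variant of getsize with a larger unit table. The extra units (B and TB), the case-insensitive lookup and the use of rsplit are assumptions about what the real listing might contain, not something the sample data requires.

```python
# Assumed unit table: binary multiples, plus plain bytes and terabytes.
mult = dict(B=1, KB=2**10, MB=2**20, GB=2**30, TB=2**40)

def getsize(aline):
    # rsplit on the last colon, so file names containing ':' still work.
    fn, size = aline.rsplit(':', 1)
    # split() with no argument tolerates repeated or trailing whitespace.
    value, unit = size.split()
    try:
        # Upper-case the unit so that "mb" and "MB" are treated alike.
        multiplier = mult[unit.upper()]
    except KeyError:
        raise ValueError("unknown unit %r in line %r" % (unit, aline))
    return float(value) * multiplier

thelines = ['big:2 TB', 'hello:3.1 GB', 'small:512 B',
            'world:1.2 mb', 'a:b:956.2 KB']

# sorted() returns a new list and leaves thelines untouched.
ordered = sorted(thelines, key=getsize)
print(ordered)
```

An unknown unit raises a ValueError naming the offending line, which tends to be easier to track down than a bare KeyError.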
