what's the quickest way to take a list of files and a name of an output file and merge them into a single file while removing duplicate lines? something like
cat file1 file2 file3 | sort -u > out.file
in python.
prefer not to use system calls.
AND:
what's the quickest way to split a list in python into X chunks (list of lists) as equal as possible? (given a list and X.)
First:
lines = set()
for filename in filenames:
    with open(filename) as inF:
        lines.update(inF)  # each element keeps its trailing '\n'

with open(outfile, 'w') as outF:
    outF.write(''.join(lines))
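A quick end-to-end sketch of the approach above, using throwaway files in a temp directory (the file names and contents are made up for the demo):

```python
import os
import tempfile

# Build two small sample files (hypothetical data for illustration).
tmpdir = tempfile.mkdtemp()
filenames = [os.path.join(tmpdir, name) for name in ('a.txt', 'b.txt')]
with open(filenames[0], 'w') as f:
    f.write('apple\nbanana\n')
with open(filenames[1], 'w') as f:
    f.write('banana\ncherry\n')

# Merge the files, letting the set drop duplicate lines.
lines = set()
for filename in filenames:
    with open(filename) as inF:
        lines.update(inF)  # each element keeps its trailing '\n'

outfile = os.path.join(tmpdir, 'out.txt')
with open(outfile, 'w') as outF:
    outF.write(''.join(lines))
```

Note that a set does not preserve input order: the merged file contains each distinct line exactly once, but in arbitrary order.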
Second:
def chunk(bigList, x):
    # Spread the remainder over the leading chunks so that exactly x
    # chunks are produced and their lengths differ by at most one.
    q, r = divmod(len(bigList), x)
    start = 0
    for i in range(x):
        end = start + q + (1 if i < r else 0)
        yield bigList[start:end]
        start = end
listOfLists = list(chunk(bigList, x))
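As a self-contained check of the chunking idea (here using divmod to spread the remainder over the leading chunks, which is one way to keep lengths within one of each other; the function name is just for the demo):

```python
def chunk_even(big_list, x):
    # Yield exactly x chunks whose lengths differ by at most one.
    q, r = divmod(len(big_list), x)
    start = 0
    for i in range(x):
        end = start + q + (1 if i < r else 0)
        yield big_list[start:end]
        start = end

result = list(chunk_even(list(range(10)), 3))
print(result)  # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Naive stride slicing with `len(bigList) // x` as the step would instead yield four chunks here, since 10 is not divisible by 3.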
For the first:
lines = []
for filename in filenames:
    f = open(filename)
    lines.extend(f.read().splitlines())  # splitlines() avoids a trailing empty element
    f.close()
lines = list(set(lines))  # remove duplicates
f = open(outfile_name, 'w')
f.write('\n'.join(lines) + '\n')
f.close()
This assumes the files are of reasonable length, since all of their data is held in memory at once. If you want to preserve sort's side effect of ordering the lines, just add lines.sort() before the file is written.
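To reproduce `sort -u` end to end, dedupe and then sort before writing; a minimal sketch with made-up file names:

```python
import os
import tempfile

# Two small sample inputs (hypothetical data for illustration).
tmpdir = tempfile.mkdtemp()
paths = [os.path.join(tmpdir, n) for n in ('f1.txt', 'f2.txt')]
with open(paths[0], 'w') as f:
    f.write('pear\napple\n')
with open(paths[1], 'w') as f:
    f.write('apple\nmango\n')

lines = []
for path in paths:
    with open(path) as f:
        lines.extend(f.read().splitlines())

lines = sorted(set(lines))  # deduplicate, then order like sort -u
out_path = os.path.join(tmpdir, 'merged.txt')
with open(out_path, 'w') as f:
    f.write('\n'.join(lines) + '\n')
```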
And the second:
step_size = -(-len(orig_list) // num_chunks)  # ceiling division, so at most num_chunks slices
split_list = [orig_list[i:i+step_size] for i in range(0, len(orig_list), step_size)]
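With plain floor division, the range can produce more than num_chunks slices when the length isn't evenly divisible; ceiling division caps the count at num_chunks. A quick sketch:

```python
orig_list = list(range(10))
num_chunks = 3

# Ceiling division: step_size = ceil(len / num_chunks), so at most num_chunks slices.
step_size = -(-len(orig_list) // num_chunks)
split_list = [orig_list[i:i + step_size]
              for i in range(0, len(orig_list), step_size)]
print(split_list)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Unlike the divmod approach, this puts all the slack in the last chunk (4, 4, 2 rather than 4, 3, 3), but it is a one-liner and still yields exactly num_chunks chunks.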