I have a really ugly loop in my code which is really slowing down my program. The loop basically performs a dictionary comparison: if a key in A_dict is the same as the first element of a value in B_dict, then for all matches a sort is performed and the result is written to a file.
for k, v in A_dict.items():
    for i, value in B_dict.items():
        if k == value[0]:
            sorted_B = [list(value) for key, value in groupby(sorted(B_dict.values()), key=itemgetter(1,2))]
            outfile.write('{0}\t{1}\t{2}\t{3}\t{4}\t{5}\n'.format(i, k, v, value[1], value[2], value[3]))
Unfortunately, the dictionaries both contain over a million items. Other than putting this data into a database then sorting, does anyone have any suggestions on how to speed up this loop? Thanks for the help.
Your example code may be inaccurate, but as written,
sorted_B = [list(value) for key, value in
            groupby(sorted(B_dict.values()), key=itemgetter(1,2))]
will be the same every time... why is it in a loop at all?
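To illustrate, here is a minimal sketch of computing the grouping once, before any loop runs. The sample data and the assumption that every value in B_dict is a 4-tuple are made up for illustration. Note also that groupby only merges adjacent items, so this sketch sorts with the same key it groups by, which the code as written does not do.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data: each value is a tuple (a_key, x, y, z).
B_dict = {
    "b1": ("a1", 2, 5, "p"),
    "b2": ("a2", 1, 3, "q"),
    "b3": ("a1", 1, 3, "r"),
    "b4": ("a3", 2, 5, "s"),
}

group_key = itemgetter(1, 2)

# Computed once: the result depends only on B_dict, not on any loop variable.
# groupby merges only adjacent items, so sort by the same key first.
sorted_B = [list(group) for _, group in
            groupby(sorted(B_dict.values(), key=group_key), key=group_key)]
```

If the grouping really is needed per match, it would have to depend on k or value somehow; as posted, it does not.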
Also
for k, v in A_dict.items():
    for i, value in B_dict.items():
        if k == value[0]:
            outfile.write('{0}\t{1}\t{2}\t{3}\t{4}\t{5}\n'.format(
                i, k, v, value[1], value[2], value[3]))
Looks like it could just be written as
for i, value in B_dict.items():
    k = value[0]
    if k in A_dict:
        outfile.write('{0}\t{1}\t{2}\t{3}\t{4}\t{5}\n'.format(
            i, k, A_dict[k], value[1], value[2], value[3]))
Which should be faster: it's linear time rather than quadratic time, because the check k in A_dict is a single hash lookup instead of a pass over every item of A_dict.
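For anyone who wants to try the single-pass version, here is a small self-contained sketch. The sample dictionaries, the function name write_matches, and the use of io.StringIO in place of the real output file are all assumptions for illustration, not part of the original code.

```python
import io

# Hypothetical sample data; the real dictionaries hold over a million items.
A_dict = {"a1": "alpha", "a2": "beta"}
B_dict = {
    "b1": ("a1", 2, 5, "p"),
    "b2": ("a2", 1, 3, "q"),
    "b3": ("zz", 9, 9, "r"),  # no matching key in A_dict, so it is skipped
}

def write_matches(A_dict, B_dict, outfile):
    """Write one tab-separated row per B_dict entry whose value[0] is a key of A_dict."""
    count = 0
    for i, value in B_dict.items():
        k = value[0]
        if k in A_dict:  # hash lookup, constant time on average
            outfile.write("{0}\t{1}\t{2}\t{3}\t{4}\t{5}\n".format(
                i, k, A_dict[k], value[1], value[2], value[3]))
            count += 1
    return count

buf = io.StringIO()  # stands in for the real output file
n_written = write_matches(A_dict, B_dict, buf)
output = buf.getvalue()
```

One difference to be aware of: the rows come out in B_dict iteration order, whereas the nested loop emits them grouped by A_dict order. If the order matters, the rows can be collected and sorted before writing.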