Append JSON elements to a list, then remove duplicates efficiently in Python

Developer https://www.devze.com 2023-02-03 04:01 Source: Internet
I have a JSON file that looks like this, for example:

[{"fu": "thejimjams", "su": 232104580}, {"fu": "thejimjams", "su": 216575430}, {"fu": "thejimjams", "su": 184695850}]

I need to put all the values in the "su" category from a bunch of JSON files into a list. Each file (about 200) will have its own list; then I'm going to combine the lists and remove duplicates. Is there an advisable way to go about this that saves system resources and time?

I'm thinking of making a list, looping through a JSON file to get each "su" and put it on the list, going to the next file and appending to the list, then scanning through to remove duplicates.

For removing the duplicates, I'm thinking of following the answer to this question: "Combining two lists and removing duplicates, without removing duplicates in original list", unless that's not efficient.

Basically, I'm open to recommendations about a good way to implement this.

Thanks,


Do you care about order? If not, you can add the numbers to a set(), which automatically discards duplicates. For example, if you have 200 "su" lists:

lists = [
    [...su's for file 1...],
    [...su's for file 2...],
    etc.
]

Then you can combine them into one big set with:

set(su for sus in lists for su in sus)
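
As a minimal runnable sketch of the idea above: the three short lists here are made-up stand-ins for the per-file lists (only the values from the question are reused), and `unique_sus` is a name chosen for illustration. `itertools.chain.from_iterable` flattens the nested lists in the same way as the nested generator expression.

```python
from itertools import chain

# Hypothetical per-file "su" lists; real data would come from the JSON files.
lists = [
    [232104580, 216575430],
    [216575430, 184695850],
    [184695850, 232104580],
]

# Equivalent to set(su for sus in lists for su in sus).
unique_sus = set(chain.from_iterable(lists))

print(sorted(unique_sus))
```

Either spelling walks every value once; the set keeps only the distinct ones, and their order is not preserved.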


A very straightforward way would be:

json_list = [{"fu": "thejimjams", "su": 232104580}, {"fu": "thejimjams", "su": 216575430}, {"fu": "thejimjams", "su": 184695850}]

new_list = []
for item in json_list:
    if item not in new_list:
        new_list.append(item)
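
Note that the loop above compares whole dictionaries and rescans new_list for every item, so it slows down quadratically as the data grows. If only the "su" values are needed and their first-seen order matters, one possible variant tracks seen values in a set. This is a sketch, not the answerer's code: the duplicate row and the names `seen` and `unique_in_order` are added here for illustration.

```python
json_list = [
    {"fu": "thejimjams", "su": 232104580},
    {"fu": "thejimjams", "su": 216575430},
    {"fu": "thejimjams", "su": 232104580},  # duplicate added for illustration
]

seen = set()          # fast membership checks
unique_in_order = []  # "su" values in first-seen order
for item in json_list:
    su = item["su"]
    if su not in seen:
        seen.add(su)
        unique_in_order.append(su)

print(unique_in_order)
```

On Python 3.7 and later, `list(dict.fromkeys(item["su"] for item in json_list))` should give the same result, since dictionaries keep insertion order.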


Use a Python set, which is designed to hold a collection of unique elements. Duplicates are dropped as you add elements.

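import json

output = set()
for filename in filenames:  # filenames: a list of paths to the JSON files
    with open(filename, 'r') as f:  # the file is closed when the block exits
        data = json.load(f)
    for row in data:
        output.add(row.get('su'))  # note: adds None if a row has no 'su' key

# convert back to a list
output = list(output)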
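
Putting the pieces together for the roughly 200 files, a self-contained sketch might look like the following. The function name `collect_unique_su`, the assumption that all files sit in one directory with a `.json` extension, and the choice to skip rows without an "su" key are all illustrative, not part of the original answer.

```python
import json
from pathlib import Path


def collect_unique_su(directory):
    """Return the set of distinct "su" values across all *.json files in directory."""
    unique = set()
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            rows = json.load(f)
        # update() consumes the generator directly, so no per-file list is built;
        # rows without an "su" key are skipped (an assumption of this sketch).
        unique.update(row["su"] for row in rows if "su" in row)
    return unique
```

Feeding each file straight into one shared set avoids holding 200 intermediate lists and a separate deduplication pass; call `list(collect_unique_su(...))` if a list is needed at the end.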