if __ and __ in ___ then

Developer https://www.devze.com 2023-02-24 09:46 Source: Internet
I am trying to create a script that loops through a list.

I need to look through a finite list of 400 competency identifiers (e.g. 124, 129, etc.; plain ints).

I then have a dictionary that records which competencies each user has. The key is the user name and the value for each key is a list of integers (i.e. the competencies that user has).

For example

User x - [124, 198, 2244 ...]
User Y - [129, 254, 198, 2244 ...]

I am looking to compile a matrix showing how often each competency occurs together with every other competency: an adjacency matrix.

For example, in the data above competency 198 has occurred with competency 2244 on two occasions, whereas competencies 254 and 124 have never occurred together.

I am currently using this code:

fe = []    
count = 0
competency_matches = 0
for comp in competencies_list:
    common_competencies = str("")
for comp2 in competencies_list:
    matches = int(0)
    for person in listx:
        if comp and comp2 in d1[person]:
            matches = matches + 1
        else:
            matches = matches
    common_competencies = str(common_competencies) + str(matches) + ","
fe.append(common_competencies)
print fe
print count
count = count + 1

This doesn't work; it simply returns how many times each competency has occurred overall. I think the problem is with the "if comp and comp2 in d1[person]:" line.

The problem would be, for example, if a person had the competencies [123, 1299, 1236] and I searched for competency 123: I think this would be counted twice because it appears in both the 123 and the 1236 entries. Is there a way to force an EXACT match when using the if __ and __ in ___ construct?

Or does anyone have a better suggestion for how to achieve this?

Thanks in advance for any pointers. Cheers


You're misinterpreting how the and operator works. To test whether two values are both in a list, use:

if comp in d1[person] and comp2 in d1[person]:
  ...

Your version does something else. It binds like this: if (comp) and (comp2 in d1[person]). In other words, it interprets comp as a truth value (any non-zero int is true) and then does a boolean and with the list inclusion check for comp2 alone. This is valid code, but it doesn't do what you want.
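
To see the difference concretely, here is a minimal sketch. The sample list and the value 999 are made up for illustration; only the names comp and comp2 come from the question. It also shows that in on a list of ints compares whole elements, so 123 can never match 1236:

```python
# Hypothetical sample data, chosen only for illustration.
person_comps = [123, 1299, 1236]

comp, comp2 = 999, 1299   # 999 is NOT in the list, 1299 is

# Binds as: (comp) and (comp2 in person_comps).
# comp is a non-zero int, so it is truthy and only comp2 is really tested.
wrong = bool(comp and comp2 in person_comps)

# Tests both memberships explicitly.
right = comp in person_comps and comp2 in person_comps

# Membership in a list of ints compares whole elements with ==,
# so 123 matches only the element 123, never 1236.
exact = [x for x in person_comps if x == 123]

print(wrong, right, exact)
```

The first test succeeds even though 999 is absent, which is why the original loop ends up counting how often comp2 occurs overall.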


This should run quite a bit faster because it removes an extra layer of iteration. Hope it helps.

from collections import defaultdict
from itertools import combinations

def get_competencies():
    return {
        "User X": [124, 198, 2244],
        "User Y": [129, 254, 198, 2244]
    }

def get_adjacency_pairs(c):
    pairs = defaultdict(lambda: defaultdict(int))
    for items in c.values():  # .values() works on both Python 2 and 3
        items = set(items)  # remove duplicates
        for a,b in combinations(items, 2):
            pairs[a][b] += 1
            pairs[b][a] += 1
    return pairs

def make_row(lst, fmt):
    return ''.join(fmt(i) for i in lst)

def make_table(p, fmt="{0:>8}".format, nothing=''):
    labels = sorted(p)  # sorted competency ids; works on both Python 2 and 3

    return [
        make_row([""] + labels, fmt)
    ] + [
        make_row([a] + [p[a][b] if b in p[a] else nothing for b in labels], fmt)
        for a in labels
    ]

def main():
    c = get_competencies()
    p = get_adjacency_pairs(c)
    print('\n'.join(make_table(p)))

if __name__=="__main__":
    main()

results in

             124     129     198     254    2244
     124                       1               1
     129                       1       1       1
     198       1       1               1       2
     254               1       1               1
    2244       1       1       2       1        

... obviously a 400-column table is a bit much to print to screen; I suggest using csv.writer() to save it to a file which you can then work on in Excel or OpenOffice.
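
A rough sketch of that csv.writer() suggestion follows. The function name write_matrix, the file name matrix.csv and the choice to write 0 for pairs that never occur are my own assumptions, not part of the answer above. It is written for Python 3 and uses an in-memory buffer so it can be run as is:

```python
import csv
import io

def write_matrix(pairs, out):
    """Write a nested-dict adjacency matrix to a file-like object as CSV."""
    labels = sorted(pairs)
    writer = csv.writer(out)
    writer.writerow([""] + labels)  # header row of competency ids
    for a in labels:
        # .get() reads a count without creating entries in a defaultdict
        writer.writerow([a] + [pairs[a].get(b, 0) for b in labels])

# Hypothetical counts, matching part of the example table above.
pairs = {
    124: {198: 1, 2244: 1},
    198: {124: 1, 2244: 2},
    2244: {124: 1, 198: 2},
}

buf = io.StringIO()  # stand-in for open("matrix.csv", "w", newline="")
write_matrix(pairs, buf)
print(buf.getvalue())
```

To write a real file, pass an object opened with open("matrix.csv", "w", newline="") instead of the buffer.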


Your indentation here means that your two loops aren't nested. You first iterate through competencies_list and set common_competencies to the empty string 400 times, then iterate through competencies_list again and do what phooji explained. I'm pretty sure that's not what you want to do.
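
For reference, here is one way the original loops might look with the nesting restored and the membership test fixed. This is a sketch rather than the asker's exact code: the function name cooccurrence_rows is hypothetical, it iterates over d1 directly instead of listx, and it builds lists of ints instead of comma-joined strings:

```python
def cooccurrence_rows(competencies_list, d1):
    """Return one list of counts per competency, in competencies_list order."""
    rows = []
    for comp in competencies_list:
        row = []
        for comp2 in competencies_list:  # nested inside the outer loop
            matches = 0
            for person in d1:
                owned = d1[person]
                if comp in owned and comp2 in owned:
                    matches += 1
            row.append(matches)
        rows.append(row)
    return rows

# Sample data from the question.
d1 = {"User X": [124, 198, 2244], "User Y": [129, 254, 198, 2244]}
competencies_list = [124, 129, 198, 254, 2244]

rows = cooccurrence_rows(competencies_list, d1)
for comp, row in zip(competencies_list, rows):
    print(comp, row)
```

Note that the diagonal (comp equal to comp2) ends up holding the number of users who have that competency at all. With 400 competencies this does 160,000 pair checks per user, so the combinations-based answer above is likely to be much faster.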
