Errors occur about dictionary function in Python

开发者 https://www.devze.com 2023-04-11 05:51 Source: the web

Part of my Python script (I first made the dictionary "h"):

def histogram(L):
    d = {}
    for x in L:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
    return d

h = histogram(LIST)

for vhfile in vhfiles:
    linelist = commands.getoutput('cat ' + vhfile).splitlines(True)
    list1 = []
    for line in linelist:
        name1 = line.split()[0]
        if int(h[name1]) <= 300:
            list1.append(line)

Then I got an error at the "if" line:

File "/home/xug/scratch/mrfast/NA12878/dis_rm.py", line 51, in <module>
    if int(h[name1]) <= 300:
KeyError: '080821_HWI-EAS301_0002_30ALBAAXX:1:46:1643:1310'

Any idea what happened here? thx

You get a KeyError when you try to look something up in the dict, and the dict doesn't contain that key.

In this case, it appears that the key '080821_HWI-EAS301_0002_30ALBAAXX:1:46:1643:1310' does not occur in h.
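
To confirm this, you could compare the names read from the file against the keys of h. Here is a minimal sketch; the sample names and lines are made up and only stand in for LIST and the contents of one vhfile:

```python
# Hypothetical sample data standing in for LIST and the lines of one vhfile.
LIST = ["read_a", "read_b", "read_a"]
linelist = ["read_a 10\n", "read_c 20\n", "read_b 30\n"]

# Build the histogram the same way the question does.
h = {}
for x in LIST:
    if x in h:
        h[x] += 1
    else:
        h[x] = 1

# Collect every first field that has no entry in h.
missing = [line.split()[0] for line in linelist if line.split()[0] not in h]
print(missing)  # ['read_c']
```

Any name printed here would raise a KeyError in the original loop.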

A KeyError means that you've referenced a key that doesn't exist in your dict, so there is no value to retrieve for it.

One way to deal with this is using a try/except block. If the code in the 'try' raises a 'KeyError', you know name1 wasn't in h, and you can do whatever is appropriate.

for line in linelist:
    name1 = line.split()[0]
    try:
        if int(h[name1]) <= 300:
            list1.append(line)
    except KeyError:
        pass  # name1 is not in h; replace this with code to deal with the condition

This methodology of favoring exception handling over rampant use of 'if' checking is known in the Python community as 'EAFP' (Easier to Ask Forgiveness than Permission).
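
For a version that runs on its own, here is a small EAFP sketch. The contents of h and linelist are invented sample data, and the unknown list is a hypothetical way of handling a missing name; the question doesn't say what should happen in that case:

```python
# Hypothetical sample data; in the question, h comes from histogram(LIST).
h = {"read_a": 2, "read_b": 500}
linelist = ["read_a 10\n", "read_c 20\n", "read_b 30\n"]

list1 = []
unknown = []  # names that were not found in h (an assumed way to handle them)
for line in linelist:
    name1 = line.split()[0]
    try:
        count = h[name1]
    except KeyError:
        # name1 is not a key of h: record it and move on to the next line.
        unknown.append(name1)
        continue
    if count <= 300:
        list1.append(line)

print(list1)    # ['read_a 10\n']
print(unknown)  # ['read_c']
```

Keeping only the lookup inside the try block means a KeyError raised by unrelated code can't be swallowed by accident.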

You can also (using less Pythonic means) check whether name1 is in the dict before trying to look it up:

if name1 in h:
    if int(h[name1]) <= 300:
        list1.append(line)

This approach is called "Look Before You Leap" (LBYL). EAFP is generally preferred in Python.
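
The membership check can also be folded into the lookup with dict.get, which returns a default value instead of raising. This sketch assumes that a name absent from h should count as 0, which the question doesn't state, and the data is made up:

```python
# Hypothetical sample data; in the question, h comes from histogram(LIST).
h = {"read_a": 2, "read_b": 500}
linelist = ["read_a 10\n", "read_c 20\n", "read_b 30\n"]

list1 = []
for line in linelist:
    name1 = line.split()[0]
    # h.get(name1, 0) returns 0 when name1 is not a key, so no KeyError is raised.
    if h.get(name1, 0) <= 300:
        list1.append(line)

print(list1)  # ['read_a 10\n', 'read_c 20\n']
```

Note that with a default of 0 the unknown name passes the `<= 300` test and its line is kept. If unknown names should be dropped instead, the explicit `in` check is the better fit.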

As an aside, you shouldn't even need the histogram function at all. Since Python 2.7, the collections module has a Counter class that does this for you:

>>> LIST = "This is a sentence that will get split into multiple list elements. The list elements will get counted using defaultdict, so you don't need the histogram function at all.".split()    
>>> LIST
['This', 'is', 'a', 'sentence', 'that', 'will', 'get', 'split', 'into', 'multiple', 'list', 'elements.', 'The', 'list', 'elements', 'will', 'get', 'counted', 'using', 'defaultdict,', 'so', 'you', "don't", 'need', 'the', 'histogram', 'function', 'at', 'all.']    
>>> from collections import Counter    
>>> c = Counter(LIST)
>>> c
Counter({'get': 2, 'list': 2, 'will': 2, 'defaultdict,': 1, 'elements.': 1, "don't": 1, 'is': 1, 'at': 1, 'need': 1, 'sentence': 1, 'split': 1, 'you': 1, 'into': 1, 'function': 1, 'elements': 1, 'multiple': 1, 'that': 1, 'This': 1, 'histogram': 1, 'using': 1, 'The': 1, 'a': 1, 'all.': 1, 'so': 1, 'the': 1, 'counted': 1})
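
A Counter is also relevant to the original KeyError: looking up a key it has never seen returns 0 instead of raising. A short sketch with made-up words:

```python
from collections import Counter

c = Counter(["get", "list", "get"])

print(c["get"])        # 2
print(c["missing"])    # 0, no KeyError is raised
print("missing" in c)  # False: the lookup did not add the key
```

So if h were a Counter, `h[name1] <= 300` would be true for any unknown name. Whether that is the behaviour you want depends on how unknown names should be treated.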

Before 2.7, you can use defaultdict (available since Python 2.5) to get the same result:

>>> from collections import defaultdict
>>> dd = defaultdict(int)
>>> for word in LIST:
...     dd[word] += 1
... 
>>> dd
defaultdict(<type 'int'>, {'defaultdict,': 1, 'elements.': 1, "don't": 1, 'is': 1, 'at': 1, 'need': 1, 'sentence': 1, 'split': 1, 'get': 2, 'you': 1, 'into': 1, 'function': 1, 'elements': 1, 'multiple': 1, 'that': 1, 'This': 1, 'histogram': 1, 'using': 1, 'The': 1, 'a': 1, 'all.': 1, 'list': 2, 'will': 2, 'so': 1, 'the': 1, 'counted': 1})
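
One thing to watch with defaultdict: indexing a missing key doesn't raise a KeyError, but it does insert that key with the default value. A small sketch with made-up words:

```python
from collections import defaultdict

dd = defaultdict(int)
for word in ["get", "list", "get"]:
    dd[word] += 1

print(len(dd))          # 2
print(dd["missing"])    # 0, and "missing" is now stored in dd
print(len(dd))          # 3
print(dd.get("other"))  # None: get() does not call the default factory
print(len(dd))          # still 3
```

If you only want to read counts without growing the dict, use `in` or `get()` rather than indexing.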