开发者

In Python, use "dict" with keywords or anonymous dictionaries? [closed]

开发者 https://www.devze.com 2022-12-14 14:34 出处:网络
Closed. This question is opinion-based. It is not currently accepting answers. Want to improve this question? Update the question so it can be answered with facts a开发者_StackOverflownd c
Closed. This question is opinion-based. It is not currently accepting answers.

Want to improve this question? Update the question so it can be answered with facts a开发者_StackOverflownd citations by editing this post.

Closed last year.

Improve this question

Say you want to pass a dictionary of values to a function, or otherwise want to work with a short-lived dictionary that won't be reused. There are two easy ways to do this:

Use the dict() function to create a dictionary:

foo.update(dict(bar=42, baz='qux'))

Use an anonymous dictionary:

foo.update({'bar': 42, 'baz': 'qux'})

Which do you prefer? Are there reasons other than personal style for choosing one over the other?


I prefer the anonymous dict option.

I don't like the dict() option for the same reason I don't like:

 i = int("1")

With the dict() option you're needlessly calling a function which is adding overhead you don't need:

>>> from timeit import Timer
>>> Timer("mydict = {'a' : 1, 'b' : 2, 'c' : 'three'}").timeit()
0.91826782454194589
>>> Timer("mydict = dict(a=1, b=2, c='three')").timeit()
1.9494664824719337


I think in this specific case I'd probably prefer this:

foo.update(bar=42, baz='qux')

In the more general case, I often prefer the literal syntax (what you call an anonymous dictionary, though it's just as anonymous to use {} as it is to use dict()). I think that speaks more clearly to the maintenance programmer (often me), partly because it stands out so nicely with syntax-highlighting text editors. It also ensures that when I have to add a key which is not representable as a Python name, like something with spaces, then I don't have to go and rewrite the whole line.


My answer will largely talk about the design of APIs to use dicts vs. keyword args. But it's also applicable the individual use of {...} vs. dict(...).

The bottom line: be consistent. If most of your code will refer to 'bar' as a string - keep it a string in {...}; if you normally refer to it the identifier bar - use dict(bar=...).

Constraints

Before talking about style, note that the keyword bar=42 syntax works only for strings and only if they are valid identifiers. If you need arbitrary punctuation, spaces, unicode - or even non-string keys - the question is over => only the {'bar': 42} syntax will work.

This also means that when designing an API, you must allow full dicts, and not only keyword arguments - unless you are sure that only strings, and only valid identifiers are allowed. (Technically, update(**{'spaces & punctuation': 42}) works. But it's ugly. And numbers/tuples/unicode won't work.)

Note that dict() and dict.update() combine both APIs: you can pass a single dict, you can pass keyword args, and you can even pass both (the later I think is undocumented). So if you want to be nice, allow both:

def update(self, *args, **kwargs):
    """Callable as dict() - with either a mapping or keyword args:

    .update(mapping)
    .update(**kwargs)
    """
    mapping = dict(*args, **kwargs)
    # do something with `mapping`...

This is especially recommended for a method named .update(), to follow the least-surprise rule.

Style

I find it nice to distinguish internal from external strings. By internal I mean arbitrary identifiers denoting something only inside the program (variable names, object attributes) or possibly between several programs (DB columns, XML attribute names). They are normally only visible to developers. External strings are intended for human consumption.

[Some Python coders (me included) observe the convention of using 'single_quotes' for internal strings vs. "Double quotes" for external strings. This is definitely not universal, though.]

Your question is about the proper uses of barewords (Perl term) - syntax sugars allowing to omit the quotes quotes altogether on internal strings. Some languages (notably LISP) allow them widely; the Pythonic opportunities to employ barewords are attribute access - foo.bar and keyword arguments - update(bar=...).

The stylistic dilemma here is "Are your strings internal enough to look like identifiers?"

If the keys are external strings, the answer is definitely NO:

foo.update({"The answer to the big question": 42})

# which you later might access as:
foo["The answer to the big question"]

If the keys refer to Python identifiers (e.g. object attributes), then I'd say YES:

foo.update(dict(bar=42))
# As others mentioned, in that case the cleaner API (if possible)
# would be to receive them as **kwargs directly:
foo.update(bar=42)

# which you later might access as:
foo.bar

If the keys refer to identifiers outside your Python program, such as XML attr names, or DB column names, using barewords may be good or bad choice - but you it's best to choose one style and be consistent.

Consistency is good because there is a psychological barrier between identifiers and strings. It exists because strings rarely cross it - only when using introspection to do meta-programming. And syntax highlighting only reinforces it. So if you read the code and see a green 'bar' in one place and a black foo.bar in a second place, you won't immediately make a connection.

Another important rule of thumb is: Barewords are good iff they are (mostly) fixed. E.g. if you refer to fixed DB columns mostly in your code, than using barewords to refer to them might be nice; but if half the time the column is a parameter, then it's better to use strings.

This is because parameter/constant is the most important difference people associate with the identifiers/strings barrier. The difference between column (variable) and "person" (constant) is the most readable way to convey this difference. Making them both identifiers would blur the distinction, as well as backfiring syntactically - you'd need to use **{column: value} and getattr(obj, column) etc. a lot.


I prefer your "anonymous dictionary" method and I think this is purely a personal style thing. I just find the latter version more readable but it's also what I'm used to seeing.


The dict() method has the added overhead of a function call.

>>>import timeit,dis
>>> timeit.Timer("{'bar': 42, 'baz': 'qux'}").repeat()
[0.59602910425766709, 0.60173793037941437, 0.59139834811408321]
>>> timeit.Timer("dict(bar=42, baz='qux')").repeat()
[0.98166498814792646, 0.97745355904172015, 0.99231773870701545]

>>> dis.dis(compile("{'bar': 42, 'baz': 'qux'}","","exec"))
  1           0 BUILD_MAP                0
              3 DUP_TOP
              4 LOAD_CONST               0 (42)
              7 ROT_TWO
              8 LOAD_CONST               1 ('bar')
             11 STORE_SUBSCR
             12 DUP_TOP
             13 LOAD_CONST               2 ('qux')
             16 ROT_TWO
             17 LOAD_CONST               3 ('baz')
             20 STORE_SUBSCR
             21 POP_TOP
             22 LOAD_CONST               4 (None)
             25 RETURN_VALUE

>>> dis.dis(compile("dict(bar=42, baz='qux')","","exec"))
  1           0 LOAD_NAME                0 (dict)
              3 LOAD_CONST               0 ('bar')
              6 LOAD_CONST               1 (42)
              9 LOAD_CONST               2 ('baz')
             12 LOAD_CONST               3 ('qux')
             15 CALL_FUNCTION          512
             18 POP_TOP
             19 LOAD_CONST               4 (None)
             22 RETURN_VALUE


I prefer the anonymous dictionary, too, just out of personal style.


If I have a lot of arguments, sometimes it is nice to omit the quotes on the keys:

DoSomething(dict(
   Name = 'Joe',
   Age = 20,
   Gender = 'Male',
   ))

This is a very subjective question, BTW. :)


I think the dict() function is really there for when you're creating a dict from something else, maybe something that easily produces the necessary keyword args. The anonymous method is best for 'dict literals' in the same way you'd use "" for strings, not str().


Actually, if the receiving function will only receive a dictionary with not pre-dertermined keywords, I normally use the ** passing convention.

In this example, that would be:

class Foo(object):
   def update(self, **param_dict):
       for key in param_dict:
          ....
foo = Foo()
....
foo.update(bar=42, baz='qux')
0

精彩评论

暂无评论...
验证码 换一张
取 消