开发者

How to check if an object is a list or tuple (but not string)?

开发者 https://www.devze.com 2022-12-13 14:04 出处:网络
This is what I normally do in order to ascertain th开发者_如何学编程at the input is a list/tuple - but not a str. Because many times I stumbled upon bugs where a function passes a str object by mistak

This is what I normally do in order to ascertain th开发者_如何学编程at the input is a list/tuple - but not a str. Because many times I stumbled upon bugs where a function passes a str object by mistake, and the target function does for x in lst assuming that lst is actually a list or tuple.

assert isinstance(lst, (list, tuple))

My question is: is there a better way of achieving this?


In python 2 only (not python 3):

assert not isinstance(lst, basestring)

Is actually what you want, otherwise you'll miss out on a lot of things which act like lists, but aren't subclasses of list or tuple.


Remember that in Python we want to use "duck typing". So, anything that acts like a list can be treated as a list. So, don't check for the type of a list, just see if it acts like a list.

But strings act like a list too, and often that is not what we want. There are times when it is even a problem! So, check explicitly for a string, but then use duck typing.

Here is a function I wrote for fun. It is a special version of repr() that prints any sequence in angle brackets ('<', '>').

def srepr(arg):
    if isinstance(arg, basestring): # Python 3: isinstance(arg, str)
        return repr(arg)
    try:
        return '<' + ", ".join(srepr(x) for x in arg) + '>'
    except TypeError: # catch when for loop fails
        return repr(arg) # not a sequence so just return repr

This is clean and elegant, overall. But what's that isinstance() check doing there? That's kind of a hack. But it is essential.

This function calls itself recursively on anything that acts like a list. If we didn't handle the string specially, then it would be treated like a list, and split up one character at a time. But then the recursive call would try to treat each character as a list -- and it would work! Even a one-character string works as a list! The function would keep on calling itself recursively until stack overflow.

Functions like this one, that depend on each recursive call breaking down the work to be done, have to special-case strings--because you can't break down a string below the level of a one-character string, and even a one-character string acts like a list.

Note: the try/except is the cleanest way to express our intentions. But if this code were somehow time-critical, we might want to replace it with some sort of test to see if arg is a sequence. Rather than testing the type, we should probably test behaviors. If it has a .strip() method, it's a string, so don't consider it a sequence; otherwise, if it is indexable or iterable, it's a sequence:

def is_sequence(arg):
    return (not hasattr(arg, "strip") and
            hasattr(arg, "__getitem__") or
            hasattr(arg, "__iter__"))

def srepr(arg):
    if is_sequence(arg):
        return '<' + ", ".join(srepr(x) for x in arg) + '>'
    return repr(arg)

EDIT: I originally wrote the above with a check for __getslice__() but I noticed that in the collections module documentation, the interesting method is __getitem__(); this makes sense, that's how you index an object. That seems more fundamental than __getslice__() so I changed the above.


H = "Hello"

if type(H) is list or type(H) is tuple:
    ## Do Something.
else
    ## Do Something.


Python 3:

import collections.abc

if isinstance(obj, collections.abc.Sequence) and not isinstance(obj, str):
    print("`obj` is a sequence (list, tuple, etc) but not a string or a dictionary.")

Changed in version 3.3: Moved global namespace of "Collections Abstract Base Classes" from abc to collections.abc module. For backwards compatibility, they will continue to be visible in this module as well until version 3.8 where it will stop working.

Python 2:

import collections

if isinstance(obj, collections.Sequence) and not isinstance(obj, basestring):
    print "`obj` is a sequence (list, tuple, etc) but not a string or unicode or dictionary."


Python with PHP flavor:

def is_array(var):
    return isinstance(var, (list, tuple))


Generally speaking, the fact that a function which iterates over an object works on strings as well as tuples and lists is more feature than bug. You certainly can use isinstance or duck typing to check an argument, but why should you?

That sounds like a rhetorical question, but it isn't. The answer to "why should I check the argument's type?" is probably going to suggest a solution to the real problem, not the perceived problem. Why is it a bug when a string is passed to the function? Also: if it's a bug when a string is passed to this function, is it also a bug if some other non-list/tuple iterable is passed to it? Why, or why not?

I think that the most common answer to the question is likely to be that developers who write f("abc") are expecting the function to behave as though they'd written f(["abc"]). There are probably circumstances where it makes more sense to protect developers from themselves than it does to support the use case of iterating across the characters in a string. But I'd think long and hard about it first.


Try this for readability and best practices:

Python2 - isinstance()

import types
if isinstance(lst, types.ListType) or isinstance(lst, types.TupleType):
    # Do something

Python3 - isinstance()

import typing
if isinstance(lst, typing.List) or isinstance(lst, typing.Tuple):
    # Do something

Hope it helps.


This is not intended to directly answer the OP, but I wanted to share some related ideas.

I was very interested in @steveha answer above, which seemed to give an example where duck typing seems to break. On second thought, however, his example suggests that duck typing is hard to conform to, but it does not suggest that str deserves any special handling.

After all, a non-str type (e.g., a user-defined type that maintains some complicated recursive structures) may cause @steveha srepr function to cause an infinite recursion. While this is admittedly rather unlikely, we can't ignore this possibility. Therefore, rather than special-casing str in srepr, we should clarify what we want srepr to do when an infinite recursion results.

It may seem that one reasonable approach is to simply break the recursion in srepr the moment list(arg) == [arg]. This would, in fact, completely solve the problem with str, without any isinstance.

However, a really complicated recursive structure may cause an infinite loop where list(arg) == [arg] never happens. Therefore, while the above check is useful, it's not sufficient. We need something like a hard limit on the recursion depth.

My point is that if you plan to handle arbitrary argument types, handling str via duck typing is far, far easier than handling the more general types you may (theoretically) encounter. So if you feel the need to exclude str instances, you should instead demand that the argument is an instance of one of the few types that you explicitly specify.


I find such a function named is_sequence in tensorflow.

def is_sequence(seq):
  """Returns a true if its input is a collections.Sequence (except strings).
  Args:
    seq: an input sequence.
  Returns:
    True if the sequence is a not a string and is a collections.Sequence.
  """
  return (isinstance(seq, collections.Sequence)
and not isinstance(seq, six.string_types))

And I have verified that it meets your needs.


The str object doesn't have an __iter__ attribute

>>> hasattr('', '__iter__')
False 

so you can do a check

assert hasattr(x, '__iter__')

and this will also raise a nice AssertionError for any other non-iterable object too.

Edit: As Tim mentions in the comments, this will only work in python 2.x, not 3.x


I do this in my testcases.

def assertIsIterable(self, item):
    #add types here you don't want to mistake as iterables
    if isinstance(item, basestring): 
        raise AssertionError("type %s is not iterable" % type(item))

    #Fake an iteration.
    try:
        for x in item:
            break;
    except TypeError:
        raise AssertionError("type %s is not iterable" % type(item))

Untested on generators, I think you are left at the next 'yield' if passed in a generator, which may screw things up downstream. But then again, this is a 'unittest'


In "duck typing" manner, how about

try:
    lst = lst + []
except TypeError:
    #it's not a list

or

try:
    lst = lst + ()
except TypeError:
    #it's not a tuple

respectively. This avoids the isinstance / hasattr introspection stuff.

You could also check vice versa:

try:
    lst = lst + ''
except TypeError:
    #it's not (base)string

All variants do not actually change the content of the variable, but imply a reassignment. I'm unsure whether this might be undesirable under some circumstances.

Interestingly, with the "in place" assignment += no TypeError would be raised in any case if lst is a list (not a tuple). That's why the assignment is done this way. Maybe someone can shed light on why that is.


Another version of duck-typing to help distinguish string-like objects from other sequence-like objects.

The string representation of string-like objects is the string itself, so you can check if you get an equal object back from the str constructor:

# If a string was passed, convert it to a single-element sequence
if var == str(var):
    my_list = [var]

# All other iterables
else: 
    my_list = list(var)

This should work for all objects compatible with str and for all kinds of iterable objects.


simplest way... using any and isinstance

>>> console_routers = 'x'
>>> any([isinstance(console_routers, list), isinstance(console_routers, tuple)])
False
>>>
>>> console_routers = ('x',)
>>> any([isinstance(console_routers, list), isinstance(console_routers, tuple)])
True
>>> console_routers = list('x',)
>>> any([isinstance(console_routers, list), isinstance(console_routers, tuple)])
True


Python 3 has this:

from typing import List

def isit(value):
    return isinstance(value, List)

isit([1, 2, 3])  # True
isit("test")  # False
isit({"Hello": "Mars"})  # False
isit((1, 2))  # False

So to check for both Lists and Tuples, it would be:

from typing import List, Tuple

def isit(value):
    return isinstance(value, List) or isinstance(value, Tuple)


assert (type(lst) == list) | (type(lst) == tuple), "Not a valid lst type, cannot be string"


Just do this

if type(lst) in (list, tuple):
    # Do stuff


in python >3.6

import collections
isinstance(set(),collections.abc.Container)
True
isinstance([],collections.abc.Container)
True
isinstance({},collections.abc.Container)
True
isinstance((),collections.abc.Container)
True
isinstance(str,collections.abc.Container)
False


I tend to do this (if I really, really had to):

for i in some_var:
   if type(i) == type(list()):
       #do something with a list
   elif type(i) == type(tuple()):
       #do something with a tuple
   elif type(i) == type(str()):
       #here's your string
0

精彩评论

暂无评论...
验证码 换一张
取 消