I am using a 3rd party library function which reads a set of keywords from a file, and is supposed to return a tuple of values. It does this correctly as long as there are at least two keywords. However, in the case where there is only one keyword, it returns a raw string, not a tuple of size one. This is particularly pernicious because when I try to do something like
for keyword in library.get_keywords():
    # Do something with keyword
in the case of a single keyword, the for loop iterates over each character of the string in succession, which throws no exception, at run time or otherwise, but is nevertheless completely useless to me.
My question is two-fold:
Clearly this is a bug in the library, which is out of my control. How can I best work around it?
Secondly, in general, if I am writing a function that returns a tuple, what is the best practice for ensuring tuples with one element are correctly generated? For example, if I have
def tuple_maker(values):
    my_tuple = (values)
    return my_tuple
for val in tuple_maker("a string"):
    print("Value was", val)
for val in tuple_maker(["str1", "str2", "str3"]):
    print("Value was", val)
I get
Value was a
Value was
Value was s
Value was t
Value was r
Value was i
Value was n
Value was g
Value was str1
Value was str2
Value was str3
What is the best way to modify the function tuple_maker to actually return a tuple when there is only a single element? Do I explicitly need to check whether the size is 1 and create the tuple separately, using the (value,) syntax? This implies that any function that has the possibility of returning a single-valued tuple must do this, which seems hacky and repetitive.
Is there some elegant general solution to this problem?
You need to test whether the returned value is a string or a tuple. I'd do it like this:
keywords = library.get_keywords()
if not isinstance(keywords, tuple):
    keywords = (keywords,) # Note the comma
for keyword in keywords:
    do_your_thang(keyword)
For your first problem, I'm not really sure if this is the best answer, but I think you need to check yourself whether the returned value is a string or tuple and act accordingly.
As for your second problem, any value can be turned into a single-element tuple by placing a comma after it:
>>> x='abc'
>>> x
'abc'
>>> tpl=x,
>>> tpl
('abc',)
Putting these two ideas together:
>>> def make_tuple(k):
...     if isinstance(k,tuple):
...         return k
...     else:
...         return k,
...
>>> make_tuple('xyz')
('xyz',)
>>> make_tuple(('abc','xyz'))
('abc', 'xyz')
Note: IMHO it is generally a bad idea to use isinstance, or any other form of logic that needs to check the type of an object at runtime. But for this problem I don't see any way around it.
Your tuple_maker doesn't do what you think it does. A definition of tuple_maker equivalent to yours is
def tuple_maker(values):
    return values
What you're seeing is that tuple_maker("a string") returns a string, while tuple_maker(["str1","str2","str3"]) returns a list of strings; neither returns a tuple!
Tuples in Python are defined by the presence of commas, not parentheses (the empty tuple, (), is the one exception). Thus (1,2) is a tuple containing the values 1 and 2, while (1,) is a tuple containing the single value 1.
To convert an iterable such as a list to a tuple, as others have pointed out, use tuple(). It iterates over its argument, so wrap a single value in a list first:
>>> tuple([1])
(1,)
>>> tuple([1,2])
(1, 2)
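Applied to the question's tuple_maker, one possible fix is sketched below. It assumes Python 3 (on Python 2 the string check would use basestring instead of str) and assumes that a lone string should count as a single element; neither assumption comes from the library in question.

```python
def tuple_maker(values):
    """Return `values` as a tuple, treating a lone string as one element."""
    if isinstance(values, str):
        # A string is iterable, so tuple(values) would split it into characters.
        return (values,)
    # Lists, tuples and other iterables are converted element by element.
    return tuple(values)


for val in tuple_maker("a string"):
    print("Value was", val)
for val in tuple_maker(["str1", "str2", "str3"]):
    print("Value was", val)
```

With this version the first loop runs once, with the whole string, instead of once per character.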
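The parentheses have almost nothing to do with tuples in Python; the tuple syntax uses the comma. The parentheses are optional, except for the empty tuple, (), and in places where they are needed for grouping, such as inside a function call.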
E.g.:
>>> a=1, 2, 3
>>> type(a)
<class 'tuple'>
>>> a=1,
>>> type(a)
<class 'tuple'>
>>> a=(1)
>>> type(a)
<class 'int'>
I guess this is the root of the problem.
There's always monkeypatching!
# Store a reference to the real library function
really_get_keywords = library.get_keywords
# Define our patched version of the function, which uses the real
# version above, adjusting its return value as necessary
def patched_get_keywords():
    """Make sure we always get a tuple of keywords."""
    result = really_get_keywords()
    return result if isinstance(result, tuple) else (result,)
# Install the patched version
library.get_keywords = patched_get_keywords
NOTE: This code might burn down your house and sleep with your wife.
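If several functions need the same treatment, the wrapping logic could be factored into a decorator. The following is only a sketch: ensure_tuple is a hypothetical helper name, and get_keywords here is a stand-in written for illustration, not the real library function.

```python
import functools


def ensure_tuple(func):
    """Wrap `func` so that a non-tuple return value becomes a 1-tuple."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        return result if isinstance(result, tuple) else (result,)
    return wrapper


# Stand-in for the buggy library function (hypothetical, for illustration).
def get_keywords(single=False):
    return "alpha" if single else ("alpha", "beta")


safe_get_keywords = ensure_tuple(get_keywords)
print(safe_get_keywords(single=True))   # ('alpha',)
print(safe_get_keywords(single=False))  # ('alpha', 'beta')
```

The same decorator could be used for the monkeypatch itself, by assigning the wrapped function back onto the module attribute.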
Rather than checking for a length of 1, I'd use the isinstance built-in instead.
>>> isinstance('a_str', tuple)
False
>>> isinstance(('str1', 'str2', 'str3'), tuple)
True
Is it absolutely necessary that it returns tuples, or will any iterable do?
from collections.abc import Iterable
def iterate(keywords):
    # A string is itself iterable, so treat it as a single keyword
    if isinstance(keywords, str) or not isinstance(keywords, Iterable):
        yield keywords
    else:
        for keyword in keywords:
            yield keyword
for keyword in iterate(library.get_keywords()):
    print(keyword)
For your first problem, you could check whether the return value is a tuple using
type(r) is tuple
# alternative
isinstance(r, tuple)
# one-liner
def as_tuple(r): return [ tuple([r]), r ][type(r) is tuple]
For the second, I like to use tuple([1]); I think it is a matter of taste. You could probably also write a wrapper, for example def tuple1(s): return tuple([s])
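The two checks are not quite interchangeable: type(r) is tuple rejects subclasses of tuple, while isinstance accepts them. A small sketch of the difference, using a namedtuple (Point is a made-up example type) and a conditional-expression version of as_tuple:

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])  # hypothetical tuple subclass
p = Point(1, 2)

print(type(p) is tuple)      # False: the exact type check rejects subclasses
print(isinstance(p, tuple))  # True: isinstance accepts subclasses


def as_tuple(r):
    """Return r unchanged if it is a tuple, else wrap it in a 1-tuple."""
    return r if isinstance(r, tuple) else (r,)


print(as_tuple("abc"))  # ('abc',)
print(as_tuple(p) is p)  # True: the namedtuple is passed through untouched
```

With the type(r) is tuple variant, p would instead be wrapped as (p,), which may or may not be what you want.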
There is an important thing to watch out for when using the tuple() constructor instead of the literal comma syntax to create a single-string tuple. Here is a nose2/unittest script you can use to play with the problem:
#!/usr/bin/env python
# vim: ts=4 sw=4 sts=4 et
from __future__ import print_function
# global
import unittest
import os
import sys
import logging
import pprint
import shutil
# module-level logger
logger = logging.getLogger(__name__)
# module-global test-specific imports
# where to put test output data for compare.
testdatadir = os.path.join('.', 'test', 'test_data')
rawdata_dir = os.path.join(os.path.expanduser('~'), 'Downloads')
testfiles = (
    'bogus.data',
)
purge_results = False
output_dir = os.path.join('test_data', 'example_out')
def cleanPath(path):
    '''cleanPath
    Recursively removes everything below a path
    :param path:
    the path to clean
    '''
    for root, dirs, files in os.walk(path):
        for fn in files:
            logger.debug('removing {}'.format(fn))
            os.unlink(os.path.join(root, fn))
        for dn in dirs:
            # recursive
            try:
                logger.debug('recursive del {}'.format(dn))
                shutil.rmtree(os.path.join(root, dn))
            except Exception:
                # for now, halt on all. Override with shutil onerror
                # callback and ignore_errors.
                raise
class TestChangeMe(unittest.TestCase):
    '''
    TestChangeMe
    '''
    testdatadir = None
    rawdata_dir = None
    testfiles = None
    output_dir = output_dir
    def __init__(self, *args, **kwargs):
        self.testdatadir = os.path.join(os.path.dirname(
            os.path.abspath(__file__)), testdatadir)
        super(TestChangeMe, self).__init__(*args, **kwargs)
        # check for kwargs
        # this allows test control by instance
        self.testdatadir = kwargs.get('testdatadir', testdatadir)
        self.rawdata_dir = kwargs.get('rawdata_dir', rawdata_dir)
        self.testfiles = kwargs.get('testfiles', testfiles)
        self.output_dir = kwargs.get('output_dir', output_dir)
    def setUp(self):
        '''setUp
        pre-test setup called before each test
        '''
        logging.debug('setUp')
        if not os.path.exists(self.testdatadir):
            os.mkdir(self.testdatadir)
        else:
            self.assertTrue(os.path.isdir(self.testdatadir))
        self.assertTrue(os.path.exists(self.testdatadir))
        cleanPath(self.output_dir)
    def tearDown(self):
        '''tearDown
        post-test cleanup, if required
        '''
        logging.debug('tearDown')
        if purge_results:
            cleanPath(self.output_dir)
    def tupe_as_arg(self, tuple1, tuple2, tuple3, tuple4):
        '''test_something_0
        auto-run tests sorted by ascending alpha
        '''
        # for testing, recreate strings and lens
        string1 = 'string number 1'
        len_s1 = len(string1)
        string2 = 'string number 2'
        len_s2 = len(string2)
        # run the same tests...
        # should test as type = string
        self.assertTrue(type(tuple1) == str)
        self.assertFalse(type(tuple1) == tuple)
        self.assertEqual(len_s1, len_s2, len(tuple1))
        self.assertEqual(len(tuple2), 1)
        # this will fail
        # self.assertEqual(len(tuple4), 1)
        self.assertEqual(len(tuple3), 2)
        self.assertTrue(type(string1) == str)
        self.assertTrue(type(string2) == str)
        self.assertTrue(string1 == tuple1)
        # should test as type == tuple
        self.assertTrue(type(tuple2) == tuple)
        self.assertTrue(type(tuple4) == tuple)
        self.assertFalse(type(tuple1) == type(tuple2))
        self.assertFalse(type(tuple1) == type(tuple4))
        # this will fail
        # self.assertFalse(len(tuple4) == len(tuple1))
        self.assertFalse(len(tuple2) == len(tuple1))
    def default_test(self):
        '''testFileDetection
        Tests all data files for type and compares the results to the current
        stored results.
        '''
        # test 1
        __import__('pudb').set_trace()
        string1 = 'string number 1'
        len_s1 = len(string1)
        string2 = 'string number 2'
        len_s2 = len(string2)
        tuple1 = (string1)
        tuple2 = (string1,)
        tuple3 = (string1, string2)
        tuple4 = tuple(string1,)
        # should test as type = string
        self.assertTrue(type(tuple1) == str)
        self.assertFalse(type(tuple1) == tuple)
        self.assertEqual(len_s1, len_s2, len(tuple1))
        self.assertEqual(len(tuple2), 1)
        # this will fail
        # self.assertEqual(len(tuple4), 1)
        self.assertEqual(len(tuple3), 2)
        self.assertTrue(type(string1) == str)
        self.assertTrue(type(string2) == str)
        self.assertTrue(string1 == tuple1)
        # should test as type == tuple
        self.assertTrue(type(tuple2) == tuple)
        self.assertTrue(type(tuple4) == tuple)
        self.assertFalse(type(tuple1) == type(tuple2))
        self.assertFalse(type(tuple1) == type(tuple4))
        # this will fail
        # self.assertFalse(len(tuple4) == len(tuple1))
        self.assertFalse(len(tuple2) == len(tuple1))
        self.tupe_as_arg(tuple1, tuple2, tuple3, tuple4)
# stand-alone test execution
if __name__ == '__main__':
    import nose2
    nose2.main(
        argv=[
            'fake',
            '--log-capture',
            'TestChangeMe.default_test',
        ])
You will notice that the (nearly) identical code calling tuple(string1,) produces an object of type tuple, but its length is the same as the string's length and every member is a single character.
This causes the assertions on lines #137, #147, #104 and #115 of the original script (the ones commented out under "this will fail") to fail when enabled, even though they are seemingly identical to the ones that pass.
(Note: I have a PuDB breakpoint in the code at line #124, the __import__('pudb').set_trace() call. It's an excellent debugging tool, but you can remove that line if you prefer. Otherwise simply pip install pudb to use it.)
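The core of what the script demonstrates can be condensed into a few lines. This is a minimal sketch that reuses the variable names from the test above:

```python
string1 = "string number 1"

tuple1 = (string1)        # parentheses only: still a str
tuple2 = (string1,)       # trailing comma: a 1-tuple
tuple4 = tuple(string1,)  # same as tuple(string1): the comma only ends the argument list

print(type(tuple1).__name__)  # str
print(len(tuple2))            # 1
print(len(tuple4))            # 15
print(tuple4[:3])             # ('s', 't', 'r')
print(tuple([string1]))       # ('string number 1',)
```

Inside a call, a trailing comma belongs to the argument list rather than creating a tuple, so tuple() receives the bare string and iterates over its characters. Wrapping the string in a list, or writing (string1,) directly, avoids this.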