Parsing an indeterminate amount of data into a Python tuple

Developer https://www.devze.com 2023-01-16 05:27 Source: Internet
I have a config file that contains a list of strings. I need to read these strings in order and store them in memory, and I'm going to be iterating over them many times when certain events take place. Since I don't need to add to or modify the list once the strings are read from the file, a tuple seems like the most appropriate data structure.

However, I'm a little confused about the best way to construct the tuple in the first place, since it's immutable. Should I parse the strings into a list and then put them in a tuple? Is that wasteful? Is there a way to get them into a tuple directly, without the overhead of copying/destroying the tuple every time I add a new element?


As you said, you're going to read the data gradually - so a tuple isn't a good idea after all, as it's immutable.

Is there a reason for not using a simple list for holding the strings?


Since your data is changing, I am not sure you need a tuple. A list should do fine.

Look at the following, which should give you further information. Creating a tuple is generally faster than creating a list. But if you need to modify elements every now and then, a tuple may not make sense.

  • Are tuples more efficient than lists in Python?


I wouldn't worry about the overhead of first creating a list and then a tuple from that list. My guess is that the overhead will turn out to be negligible if you measure it.
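If you want to check this yourself, here is a rough sketch using the standard `timeit` module. The `lines` list is a made-up stand-in for the contents of the config file, and the function names are only for illustration. Timings depend on the machine and the Python version, so no numbers are quoted.

```python
import timeit

# Hypothetical stand-in for the lines of the config file.
lines = ["option_%d" % i for i in range(1000)]

def via_list():
    # Build a list first, then copy it into a tuple.
    items = []
    for line in lines:
        items.append(line)
    return tuple(items)

def direct():
    # Let tuple() consume the iterable itself.
    return tuple(lines)

# Both approaches produce the same tuple.
assert via_list() == direct()

t_via_list = timeit.timeit(via_list, number=200)
t_direct = timeit.timeit(direct, number=200)
print("list then tuple: %.4f s" % t_via_list)
print("tuple directly:  %.4f s" % t_direct)
```

Either way the conversion happens once at startup, while the iteration happens many times, so the one-off cost is unlikely to matter.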

On the other hand, I would stick with the list and iterate over that instead of creating a tuple. Tuples should be used for struct-like data and lists for sequences of data, which is what your data sounds like to me.


with open("config") as infile:
    config = tuple(infile)
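Note that iterating over a file yields each line with its trailing newline still attached. If that is not wanted, a generator expression passed to `tuple()` can strip the lines on the way in. A small sketch follows, with `io.StringIO` standing in for the real file (the sample contents are made up):

```python
import io

# io.StringIO stands in for the real config file (an assumption for the demo).
infile = io.StringIO("alpha\nbeta\n\ngamma\n")

# Strip the trailing newline from each line and skip blank lines.
config = tuple(line.strip() for line in infile if line.strip())

print(config)  # ('alpha', 'beta', 'gamma')
```

`tuple()` accepts any iterable, so no intermediate list has to be written out by hand.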


You may want to try using chained generators to create your tuple. You can use the generators to perform multiple filtering and transformation operations on your input without creating intermediate lists. All of the generator processing is delayed until iteration. In the example below, all of the processing and iteration happens on the last line.

Like so:

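with open('settings.cfg') as f:
    # Split each "key: value" line on the first colon and strip whitespace.
    step1 = (tuple(i.strip() for i in l.split(':', 1)) for l in f if len(l) > 2 and ':' in l)
    # For keys containing 'Tag', split a comma-separated value into a list.
    step2 = ((k, v.split(',') if ',' in v and 'Tag' in k else v) for k, v in step1)
    # Nothing is read from the file until tuple() consumes the generators.
    t = tuple(step2)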
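To see the delayed evaluation in action, here is a self-contained sketch of the same pipeline. The `sample` lines and the `read_lines` helper are hypothetical stand-ins for the settings file. The helper records each line as it is handed out, which shows that nothing is consumed until `tuple()` runs.

```python
# Hypothetical sample input standing in for the settings file.
sample = ["Name: demo\n", "Tags: a,b,c\n", "\n", "no colon here\n", "Path: /tmp/x:y\n"]

consumed = []

def read_lines():
    # Record every line handed out, so laziness can be observed.
    for line in sample:
        consumed.append(line)
        yield line

# Keep only "key: value" lines and split them on the first colon.
step1 = (tuple(i.strip() for i in l.split(':', 1))
         for l in read_lines() if len(l) > 2 and ':' in l)
# Split comma-separated values for keys containing 'Tag'.
step2 = ((k, v.split(',') if ',' in v and 'Tag' in k else v) for k, v in step1)

# Building the generators has not read any input yet.
assert consumed == []

t = tuple(step2)

# Now every input line has been pulled through the pipeline.
assert len(consumed) == len(sample)
print(t)  # (('Name', 'demo'), ('Tags', ['a', 'b', 'c']), ('Path', '/tmp/x:y'))
```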