
ctypes variable length structures

Source: https://www.devze.com (reposted from the web), 2023-03-27 08:20

Ever since I read Dave Beazley's post on binary I/O handling (http://dabeaz.blogspot.com/2009/08/python-binary-io-handling.html) I've wanted to create a Python library for a certain wire protocol. However, I can't find the best solution for variable length structures. Here's what I want to do:

import ctypes as c

class Point(c.Structure):
    _fields_ = [
        ('x',c.c_double),
        ('y',c.c_double),
        ('z',c.c_double)
        ]

class Points(c.Structure):
    _fields_ = [
        ('num_points', c.c_uint32),
        ('points', Point*num_points) # num_points not yet defined!
        ]

The class Points won't work since num_points isn't defined yet. I could redefine the _fields_ variable later once num_points is known, but since it's a class variable it would affect all of the other Points instances.

What is a pythonic solution to this problem?


The most straightforward way, given your example, is to define the structure only once you have the information you need.

A simple way to do that is to create the class at the point where you will use it, rather than at module level. For example, you can put the class body inside a function that acts as a factory. I think that is the most readable way.

import ctypes as c



class Point(c.Structure):
    _fields_ = [
        ('x',c.c_double),
        ('y',c.c_double),
        ('z',c.c_double)
        ]

def points_factory(num_points):
    class Points(c.Structure):
        _fields_ = [
            ('num_points', c.c_uint32),
            ('points', Point*num_points) 
            ]
    return Points

#and when you need it in the code:
Points = points_factory(5)
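
As written, the factory builds a brand-new class on every call, so two calls with the same length give two distinct types. A minimal sketch of one way around that, assuming it is acceptable to memoize the factory with functools.lru_cache (the caching is my addition, not part of the answer above):

```python
import ctypes as c
from functools import lru_cache


class Point(c.Structure):
    _fields_ = [('x', c.c_double), ('y', c.c_double), ('z', c.c_double)]


@lru_cache(maxsize=None)
def points_factory(num_points):
    # Build (once per length) a structure type with an inline array of Point.
    class Points(c.Structure):
        _fields_ = [
            ('num_points', c.c_uint32),
            ('points', Point * num_points),
        ]
    return Points


Points5 = points_factory(5)
pts = Points5(num_points=5)
pts.points[2].x = 1.5

# The same length always maps to the same class object.
print(points_factory(5) is Points5)
```

Note that ctypes lays the fields out with native alignment, so there may be padding between num_points and the first point. If the wire protocol has no such padding, the layout has to be adjusted (ctypes offers the _pack_ class attribute for that).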

Sorry - it is the C code that will "fill in" the values for you, so that is not the answer then. Will post another way.


And now, for something completely different - if all you need is to deal with the data, possibly the "most Pythonic" way is not to use ctypes to handle raw data in memory at all.

This approach just uses struct.pack and struct.unpack to serialize/deserialize the data as it moves into and out of your app. The "Points" class can accept the raw bytes and create Python objects from them, and it can serialize the data through a "get_data" method. Otherwise, it is just an ordinary Python list.

import struct

class Point(object):
    def __init__(self, x=0.0, y=0.0, z=0.0):
        self.x, self.y, self.z = x, y, z
    def get_data(self):
        return struct.pack("ddd", self.x, self.y, self.z)


class Points(list):
    def __init__(self, data=None):
        if data is None:
            return
        pointsize = struct.calcsize("ddd")
        # skip the leading "i" point count, then read one point per step
        for index in range(struct.calcsize("i"), len(data), pointsize):
            point_data = struct.unpack("ddd", data[index: index + pointsize])
            self.append(Point(*point_data))

    def get_data(self):
        # struct.pack returns bytes, so join with a bytes separator
        return struct.pack("i", len(self)) + b"".join(p.get_data() for p in self)
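
For a wire protocol you usually want the byte order and sizes pinned down instead of left to the native defaults. A rough sketch of the same idea with precompiled struct.Struct objects, assuming a little-endian uint32 count followed by three little-endian doubles per point (those format choices and the function names are mine, not from the question):

```python
import struct

HEADER = struct.Struct("<I")    # assumed: little-endian uint32 point count
POINT = struct.Struct("<ddd")   # assumed: three little-endian doubles


def pack_points(points):
    """Serialize a list of (x, y, z) tuples as count + packed points."""
    return HEADER.pack(len(points)) + b"".join(POINT.pack(*p) for p in points)


def unpack_points(data):
    """Parse count + packed points back into a list of (x, y, z) tuples."""
    (count,) = HEADER.unpack_from(data, 0)
    expected = HEADER.size + count * POINT.size
    if len(data) < expected:
        raise ValueError("truncated packet")
    return [POINT.unpack_from(data, HEADER.size + i * POINT.size)
            for i in range(count)]


raw = pack_points([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)])
print(unpack_points(raw))
```

Using the count from the header (rather than the buffer length) to drive the loop also lets the parser notice a truncated packet.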


This question is really, really, old:

I have a simpler answer, which seems strange, but avoids metaclasses and resolves the issue that ctypes doesn't allow me to directly build a struct with the same definition as I can in C.

The example C struct, coming from the kernel:

struct some_struct {
        __u32   static;
        __u64   another_static;
        __u32   len;
        __u8    data[0];
};

With ctypes implementation:

import ctypes
import copy

class StructureVariableSized(ctypes.Structure):
    _variable_sized_ = []

    def __new__(cls, variable_sized=(), **kwargs):
        def name_builder(name, variable_sized):
            # Append e.g. "Data[32]" so each generated class gets a distinct name.
            for variable_sized_field_name, variable_size in variable_sized:
                name += variable_sized_field_name.title() + '[{0}]'.format(variable_size)
            return name

        # Start from the fixed fields, then append one array field for each
        # entry in _variable_sized_, using the size requested by the caller.
        local_fields = copy.deepcopy(cls._fields_)
        for matching_field_name, matching_type in cls._variable_sized_:
            match_size = None
            for variable_sized_field_name, variable_size in variable_sized:
                if variable_sized_field_name == matching_field_name:
                    match_size = variable_size
                    break
            if match_size is None:
                raise ValueError('no size given for field %r' % matching_field_name)
            local_fields.append((matching_field_name, matching_type * match_size))
        name = name_builder(cls.__name__, variable_sized)
        # The returned object is an instance of this freshly built class,
        # not of cls itself.
        class BaseCtypesStruct(ctypes.Structure):
            _fields_ = local_fields
            _variable_sized_ = cls._variable_sized_
        BaseCtypesStruct.__name__ = name
        return BaseCtypesStruct(**kwargs)


class StructwithVariableArrayLength(StructureVariableSized):
    _fields_ = [
        ('static', ctypes.c_uint32),
        ('another_static', ctypes.c_uint64),
        ('len', ctypes.c_uint32),
        ]
    _variable_sized_ = [
        ('data', ctypes.c_uint8)
    ]

struct_map = {
    1: StructwithVariableArrayLength
}
sval32 = struct_map[1](variable_sized=(('data', 32),),)
print(sval32)
print(sval32.data)
sval128 = struct_map[1](variable_sized=(('data', 128),),)
print(sval128)
print(sval128.data)

With sample output:

machine:~ user$ python svs.py 
<__main__.StructwithVariableArrayLengthData[32] object at 0x10dae07a0>
<__main__.c_ubyte_Array_32 object at 0x10dae0830>
<__main__.StructwithVariableArrayLengthData[128] object at 0x10dae0830>
<__main__.c_ubyte_Array_128 object at 0x10dae08c0>

This answer works for me for a couple reasons:

  1. The argument to the constructor can be pickled, and has no references to types.
  2. I define all of the structure inside of the StructwithVariableArrayLength definition.
  3. To the caller, the structure looks identical as if I had just defined the array inside of _fields_
  4. I have no ability to modify the underlying structure defined in the header file, and accomplish my goals without changing any underlying code.
  5. I don't have to modify any parse/pack logic, this only does what I'm trying to do which is build a class definition with a variable length array.
  6. This is a generic, reusable container that can be sent into the factory like my other structures.

I would obviously prefer the header file took a pointer, but that isn't always possible. That answer was frustrating. The others were very tailored to the data structure itself, or required modification of the caller.


So, just as in C, you can't do exactly what you want. The only useful way of working with a structure that does what you want in C is to have it as

struct Points {
   int num_points;
   Point *points;
};

And have utility code to allocate the memory where you can put your data. Unless you have some safe maximum size and don't want to bother with that part of the code (memory allocation) - the network part of the code would then transmit just the needed data from within the structure, not the whole of it.

To work in Python ctypes with a structure member that actually contains a pointer to where your data is (and so may be of variable length), you will also have to allocate and free memory manually (if you are filling it on the Python side), or just read the data if creating and destroying the data is done in native code functions.

The structure creating code can be thus:

import ctypes as c



class Point(c.Structure):
    _fields_ = [
        ('x',c.c_double),
        ('y',c.c_double),
        ('z',c.c_double)
        ]

class Points(c.Structure):
    _fields_ = [
        ('num_points', c.c_uint32),
        ('points', c.POINTER(Point))
        ]

And the code to manage the creation and deletion of these data structures can be:

__all_buffers = {}

def make_points(num_points):
    data = Points()
    data.num_points = num_points
    buf = c.create_string_buffer(c.sizeof(Point) * num_points)
    # keep the buffer alive for as long as the structure is in use
    __all_buffers[c.addressof(buf)] = buf
    p = Point.from_address(c.addressof(buf))
    data.points = c.pointer(p)
    return data

def del_points(points):
    # the first point lives at the start of the buffer, so its address is the key
    del __all_buffers[c.addressof(points.points[0])]
    points.num_points = 0

The use of the global variable "__all_buffers" keeps a reference to the Python-created buffer object so that Python does not destroy it upon leaving the make_points function. An alternative is to get a reference to either libc (on Unix systems) or the Windows API and call the system's malloc and free functions yourself.
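
If you would rather not keep a module-level dictionary, another option is to let the structure itself hold the reference. This is only a sketch under my own assumptions: the buffer is a ctypes array of Point, and "_array" is a hypothetical attribute name chosen here purely to keep that array alive.

```python
import ctypes as c


class Point(c.Structure):
    _fields_ = [('x', c.c_double), ('y', c.c_double), ('z', c.c_double)]


class Points(c.Structure):
    _fields_ = [('num_points', c.c_uint32), ('points', c.POINTER(Point))]


def make_points(num_points):
    # Allocate a zero-initialized array of Point owned by Python.
    array = (Point * num_points)()
    data = Points()
    data.num_points = num_points
    data.points = c.cast(array, c.POINTER(Point))
    # Hypothetical attribute: holding the array here ties its lifetime
    # to this Points object instead of a module-level dictionary.
    data._array = array
    return data


pts = make_points(3)
pts.points[1].y = 2.5
print(pts._array[1].y)  # same memory, seen through the array
```

With this arrangement there is no del_points to call: the array is released when the Points object is garbage collected. That only holds as long as native code does not keep the pointer around for longer than the Python object lives.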

Or you can just go with the plain old "struct" Python module instead of using ctypes - doubly so if you will have no C code at all and are just using ctypes for the convenience of its "structs".


Here's what I've come up with so far (still a little rough):

import ctypes as c

MAX_PACKET_SIZE = 8*1024
MAX_SIZE = 10

class Points(c.Structure):
    _fields_ = [
        ('_buffer', c.c_byte*MAX_PACKET_SIZE)
    ]
    _inner_fields = [
        ('num_points', c.c_uint32),
        ('points', 'Point*self.num_points')
    ]

    def __init__(self):
        self.num_points = 0
        self.points = [0,]*MAX_SIZE

    def parse(self):
        fields = []
        for name, ctype in self._inner_fields:
            if type(ctype) == str:
                ctype = eval(ctype)
            fields.append((name, ctype))
            class Inner(c.Structure, PrettyPrinter):
                _fields_ = fields
            inner = Inner.from_address(c.addressof(self._buffer))
            setattr(self, name, getattr(inner, name))
        self = inner
        return self

    def pack(self):
        fields = []
        for name, ctype in self._inner_fields:
            if type(ctype) == str:
                ctype = eval(ctype)
            fields.append((name, ctype))
        class Inner(c.Structure, PrettyPrinter):
            _fields_ = fields
        inner = Inner()
        for name, ctype in self._inner_fields:
            value = getattr(self, name)
            if type(value) == list:
                l = getattr(inner, name)
                for i in range(len(l)):
                    l[i] = getattr(self, name)[i]
            else:
                setattr(inner, name, value)
        return inner

The methods parse and pack are generic, so they could be moved to a metaclass. This would make its use almost as easy as the snippet I first posted.

Comments on this solution? Still looking for something simpler, not sure if it exists.
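
One possibly simpler variant of the same two-step idea (read the count first, then build the array type) avoids eval and the fixed-size buffer. This is a sketch only, assuming the packet is a bytes object in native byte order with no padding between the count and the points; Header and parse_points are names made up for illustration.

```python
import ctypes as c
import struct


class Point(c.Structure):
    _fields_ = [('x', c.c_double), ('y', c.c_double), ('z', c.c_double)]


class Header(c.Structure):
    _fields_ = [('num_points', c.c_uint32)]


def parse_points(packet):
    """Return a list of Point objects decoded from a bytes packet."""
    # Step 1: decode only the fixed-size header to learn the count.
    header = Header.from_buffer_copy(packet)
    # Step 2: build the array type now that the length is known.
    array_type = Point * header.num_points
    # from_buffer_copy raises ValueError if the packet is too short.
    return list(array_type.from_buffer_copy(packet, c.sizeof(Header)))


# Build a sample packet: a uint32 count of 2 followed by six doubles.
packet = struct.pack("=I", 2) + struct.pack("=6d", 1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
points = parse_points(packet)
print([(p.x, p.y, p.z) for p in points])
```

Because from_buffer_copy copies the bytes, the returned points stay valid after the packet buffer is reused for the next read.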


You could use ctypes pointers to do this.

C struct

struct some_struct {
    uint  length;
    uchar data[1];
};

Python code

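from ctypes import *

class SomeStruct(Structure):
    _fields_ = [('length', c_uint), ('data', c_ubyte)]

# read data into SomeStruct
s = SomeStruct()
# s.data comes back as a plain int, so build the pointer from the field's address
ptr_data = cast(addressof(s) + SomeStruct.data.offset, POINTER(c_ubyte))
# only valid if the memory behind s really holds s.length bytes of data
for i in range(s.length):
    print(ptr_data[i])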
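
When the data arrives as a byte string rather than from a C function, the structure definition can serve just as a layout description. A minimal sketch, assuming the raw bytes are in native byte order and that the payload starts at the offset of the data field; read_payload is a hypothetical helper, not part of ctypes.

```python
import ctypes
import struct


class SomeStruct(ctypes.Structure):
    _fields_ = [('length', ctypes.c_uint), ('data', ctypes.c_ubyte * 1)]


def read_payload(raw):
    """Copy the variable-length payload out of a raw byte string."""
    # Read the length field at its offset inside the structure layout.
    length = ctypes.c_uint.from_buffer_copy(raw, SomeStruct.length.offset).value
    # Build an array type of exactly that many bytes and copy it out.
    # from_buffer_copy raises ValueError if raw is shorter than required.
    payload_type = ctypes.c_ubyte * length
    payload = payload_type.from_buffer_copy(raw, SomeStruct.data.offset)
    return bytes(payload)


raw = struct.pack("I", 3) + b"\x0a\x0b\x0c"
print(read_payload(raw))
```

Copying with from_buffer_copy means no reads ever go past the end of the input, which is the main risk with the raw pointer indexing shown above.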


If you're willing to consider a third-party package, you might be able to use Construct.

Let's take the structure you've provided:

import ctypes

class CPoint(ctypes.Structure):
    _fields_ = [
        ('x',ctypes.c_double),
        ('y',ctypes.c_double),
        ('z',ctypes.c_double)
    ]

Using Construct's syntax, we would define the equivalent as follows:

import construct

Point = construct.Struct(
    "x" / construct.Float64l,
    "y" / construct.Float64l,
    "z" / construct.Float64l
)

We can check that they are the same:

>>> point_coordinates = {"x": 3.14, "y": 2.71, "z": 1.41}
>>> c_point = CPoint(**point_coordinates)
>>> point = Point.build(point_coordinates)
>>> bytes(c_point) == bytes(point)
True

Now we define the Points structure according to the Construct syntax:

Points = construct.Struct(
    "num_points" / construct.Int32ul,
    "points" / construct.Array(construct.this.num_points, Point)
)

Construct will automatically create an array of points based on num_points.

We can serialize a Points structure:

>>> Points.build({"num_points": 2, "points": [{"x": 3.14, "y": 2.71, "z": 1.41}, {"x": 1.73, "y": 1.20, "z": 1.61}]})
b'\x02\x00\x00\x00\x1f\x85\xebQ\xb8\x1e\t@\xaeG\xe1z\x14\xae\x05@\x8f\xc2\xf5(\\\x8f\xf6?\xaeG\xe1z\x14\xae\xfb?333333\xf3?\xc3\xf5(\\\x8f\xc2\xf9?'

Or de-serialize it:

>>> res = Points.parse(b'\x02\x00\x00\x00\x1f\x85\xebQ\xb8\x1e\t@\xaeG\xe1z\x14\xae\x05@\x8f\xc2\xf5(\\\x8f\xf6?\xaeG\xe1z\x14\xae\xfb?333333\xf3?\xc3\xf5(\\\x8f\xc2\xf9?')
>>> print(res)
Container:
    num_points = 2
    points = ListContainer:
        Container:
            x = 3.14
            y = 2.71
            z = 1.41
        Container:
            x = 1.73
            y = 1.2
            z = 1.61

And of course access the structure fields:

>>> for i in range(res.num_points):
...     print(res.points[i].x)
...
3.14
1.73
