How to manage the memory of data structures and heap of a virtual machine in plain C

Developer https://www.devze.com 2023-01-25 13:17 Source: web

In my interpreter I need to manage its runtime objects, along with its internal data structures.

I would like to create an interpreter in which there is no difference between the interpreter's own data structures (stack, symbol table) and the objects created by the user. I first saw this in Little Smalltalk.

This way the interpreter looks like a tiny real machine: the interpreter's structures live in the managed heap and are all of the same type as everything else there (as in the von Neumann architecture). I think this is the coolest and most exciting way to write an interpreter.

But I would like to do it a bit differently, creating the managed objects as C structs rather than arrays, as is normally done. The problem with C structs arises when I try to garbage collect or resize the heap: objects move, so any pointers to them would be invalidated.

Has anybody figured out how to do this with pointers? I know this is practically impossible, but has somebody come close?

Doug Lea wrote the basis for some of the malloc implementations out there, back in 1994.

You can download the public domain source:

http://g.oswego.edu/dl/html/malloc.html

I had similar concerns for my PostScript interpreter. Some discussion here.

The way I got around pointer invalidation is to have two layers of addressing. Virtual addresses, if you will.

The master data structure is allocated separately and holds the base address of the memory, the maximum allocated size, and the currently used size. At address zero in the memory is the first address table, which contains allocation sizes and base-offsets (integers, or "cursors", which when added to the base-pointer yield an actual C pointer to the data).
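To make that layout concrete, here is a minimal sketch of it. It is not the code from my interpreter: every name (`Mem`, `Entry`, `Table`, `mem_init`, `mem_alloc`, `mem_ptr`) is hypothetical, and for brevity it assumes a single fixed-size table, 16-byte alignment, and no freeing or compaction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* All names here are illustrative assumptions, not the original code. */
enum { TABSZ = 64, ALIGN = 16 };

/* Master structure: allocated separately from the managed memory. */
typedef struct {
    unsigned char *base; /* base address of the managed memory */
    size_t max;          /* bytes currently allocated          */
    size_t used;         /* bytes currently in use             */
} Mem;

typedef struct {
    size_t off;  /* "cursor": offset from base, never a raw pointer */
    size_t size; /* allocation size                                 */
} Entry;

/* The address table; it lives at offset zero inside the memory. */
typedef struct {
    size_t count;
    Entry ent[TABSZ];
} Table;

static size_t round_up(size_t n) { return (n + ALIGN - 1) / ALIGN * ALIGN; }

int mem_init(Mem *m, size_t max) {
    size_t start = round_up(sizeof(Table));
    if (max < start) max = start;
    m->base = malloc(max);
    if (!m->base) return -1;
    m->max = max;
    m->used = start;                   /* the table occupies the front */
    memset(m->base, 0, sizeof(Table));
    return 0;
}

/* Returns a handle (an index into the table), or -1 on failure. */
int mem_alloc(Mem *m, size_t size) {
    Table *t = (Table *)m->base;
    if (t->count == TABSZ) return -1;
    size = round_up(size ? size : 1);
    if (m->used + size > m->max) {     /* grow; the whole block may move */
        size_t nmax = (m->used + size) * 2;
        unsigned char *nb = realloc(m->base, nmax);
        if (!nb) return -1;
        m->base = nb;
        m->max = nmax;
        t = (Table *)m->base;          /* the table moved with the block */
    }
    int h = (int)t->count++;
    t->ent[h].off = m->used;
    t->ent[h].size = size;
    m->used += size;
    return h;
}

/* Second layer of addressing: handle -> cursor -> C pointer. */
void *mem_ptr(const Mem *m, int h) {
    const Table *t = (const Table *)m->base;
    assert(h >= 0 && (size_t)h < t->count);
    return m->base + t->ent[h].off;
}
```

Because the table stores offsets rather than pointers, `realloc` can move the whole block without any entry needing to be rewritten; only `base` in the master structure changes.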

Since the interpreter is non-interrupting, operator functions need not worry about pointers becoming invalid during their execution, so the overhead of fetching the pointer from the handle (a table index, which is the actual user-level pointer) need only be paid once per pointer.
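A rough illustration of that "fetch once" rule follows. This is a pared-down sketch, not the interpreter's code: the names (`Heap`, `heap_alloc`, `deref`, `op_sum`, `op_dup`) are made up, and the offset table is kept outside the block purely to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified heap: one growable block plus a table of offsets. */
enum { MAXH = 32 };

typedef struct {
    unsigned char *base; /* may change whenever the block grows   */
    size_t max, used;
    size_t off[MAXH];    /* handle -> offset from base            */
    int count;
} Heap;

/* Returns a handle, or -1 on failure. May move the whole block. */
int heap_alloc(Heap *hp, size_t size) {
    if (hp->count == MAXH) return -1;
    if (size == 0) size = 1;
    size = (size + 15) / 16 * 16;      /* assumed 16-byte alignment */
    if (hp->used + size > hp->max) {
        size_t nmax = (hp->used + size) * 2;
        unsigned char *nb = realloc(hp->base, nmax);
        if (!nb) return -1;
        hp->base = nb;
        hp->max = nmax;
    }
    hp->off[hp->count] = hp->used;
    hp->used += size;
    return hp->count++;
}

void *deref(Heap *hp, int h) {
    assert(h >= 0 && h < hp->count);
    return hp->base + hp->off[h];
}

/* Operator that never allocates: the pointer is fetched from the
   handle once and stays valid for the whole call. */
long op_sum(Heap *hp, int h, size_t n) {
    const int *p = deref(hp, h);
    long s = 0;
    for (size_t i = 0; i < n; i++) s += p[i];
    return s;
}

/* Operator that allocates: pointers are fetched only after the
   allocation, since heap_alloc may have moved the block. */
int op_dup(Heap *hp, int h, size_t n) {
    int copy = heap_alloc(hp, n * sizeof(int));
    if (copy < 0) return -1;
    memcpy(deref(hp, copy), deref(hp, h), n * sizeof(int));
    return copy;
}
```

The handles returned to user code stay valid across growth; only raw C pointers obtained before an allocating call have to be considered stale.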

It seemed to work well in all tests. I got hung up later needing to optimize the inner loop.