In simplistic terms, a feature structure is an unordered list of attribute-value pairs.
[number:s开发者_JS百科g, person:3 | _ ],
which can be embedded:
[cat:np, agr:[number:sg, person:3 | _ ] | _ ],
can subindex stuff and share the value
[number:[1], person:3 | _ ],
where [1] is another feature structure (that is, it allows reentrancy).
My question is: what data structure would people think this should be implemented with for later access to values, to perform unification between 2 fts, to "type" them, etc.
There is a full book on this, but it's in lisp, which simplifies list handling. So, my choices are: a hash of lists, a list of lists, or a trie. What do people think about this?
Think a little harder about what constitutes a value. I'd try the simplest thing that might possibly work:
typedef struct value {
enum { INT, BOOL, STRING, FEATURE_STRUCTURE } ty;
union {
int int;
bool bool;
char *string;
struct fs *feature_structure;
} u;
} *Value;
typedef struct fs * { // list of pairs; this rep could change
struct avpair *pair;
Value value;
} *Feature_Structure;
struct avpair {
const char *attribute;
Value value;
};
You'll want a bunch of constructor functions like
Value mkBool(bool p) {
Value v = malloc(sizeof(*v));
assert(v);
v->ty = BOOL;
v->u.bool = p;
return v;
}
At this point you can start to be in business. If "list of pairs" turns out not to be the right representation, you can change it. Without knowing what operations you plan or what your expectations are for cost model, I'd start off here. Then, if you need to move to something more efficient, I'd probably represent a feature structure using a ternary search tree and keep the same representation for Value
.
精彩评论