First of all, I'm looking at the interactions between around 300 elements. Each element will interact with all others and itself. In a minority of these cases, a reaction will occur, and I will mark that reaction.
Since this is basically a matrix with around 90,000 possible interactions, I want to manage this data with code so I can test the interactions in whatever order I like until all of them are tested. Since I obviously won't do all of them in one sitting, the data has to be stored on disk somehow.
Here's my question: What would be the ideal data structure design for this? I generally use relational databases for data storage, and this particular problem doesn't seem to mesh very well with an RDB. Please let me know if I'm not being clear.
Nothing wrong with an RDBMS here: the important thing isn't getting the data in, but reporting on it afterwards, and one can't tell from your description what your needs will be.
As for storing 300x300 results: you only need to record the reactions, not every test, plus how far through the matrix you've gotten in your testing.
Note that 90k records isn't really very much data, so you could keep it all if you wanted.
Edit: all you need is a couple of tables:
Elements
--------
ItemID
... -- whatever identifying info you need

Crossref
--------
ItemX int
ItemY int
Results -- whatever data you need
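
For illustration, here is a rough sketch of those two tables using Python's built-in sqlite3 module. The choice of SQLite, the Name column, the TEXT type for Results, and the helper names open_db, record_result and untested_pairs are all assumptions made for the example, not part of the answer above. It stores one row per tested pair (the "keep it all" option), so the untested pairs can be found with a query instead of tracking a position in the matrix.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the database and make sure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS Elements (
            ItemID INTEGER PRIMARY KEY,
            Name   TEXT NOT NULL          -- assumed identifying info
        );
        CREATE TABLE IF NOT EXISTS Crossref (
            ItemX   INTEGER NOT NULL REFERENCES Elements(ItemID),
            ItemY   INTEGER NOT NULL REFERENCES Elements(ItemID),
            Results TEXT,                 -- NULL means tested, no reaction
            PRIMARY KEY (ItemX, ItemY)
        );
    """)
    return conn


def record_result(conn, x, y, results=None):
    """Store the outcome of testing pair (x, y), overwriting any earlier row."""
    conn.execute(
        "INSERT OR REPLACE INTO Crossref (ItemX, ItemY, Results) VALUES (?, ?, ?)",
        (x, y, results),
    )
    conn.commit()


def untested_pairs(conn):
    """Return every (ItemX, ItemY) combination that has no Crossref row yet."""
    return conn.execute("""
        SELECT a.ItemID, b.ItemID
        FROM Elements a
        CROSS JOIN Elements b
        LEFT JOIN Crossref c
               ON c.ItemX = a.ItemID AND c.ItemY = b.ItemID
        WHERE c.ItemX IS NULL
        ORDER BY a.ItemID, b.ItemID
    """).fetchall()
```

Passing a file name to open_db instead of the default ":memory:" keeps the results on disk between sittings.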
For what it's worth: if the tuple {ItemX, ItemY} is equivalent to {ItemY, ItemX}, then you're not doing 300x300 comparisons, you're doing (300 + 299 + 298 + ... + 1) = 45150.
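
That count can be checked with a short Python sketch. The function names unordered_pairs and canonical are made up for the example, and it assumes the elements are numbered 1 to n. If order doesn't matter, storing each pair in canonical form (smaller ID first) keeps {ItemX, ItemY} and {ItemY, ItemX} from ending up as two separate records.

```python
from itertools import combinations_with_replacement


def unordered_pairs(n):
    """Yield every pair (x, y) with 1 <= x <= y <= n, self-pairs included."""
    return combinations_with_replacement(range(1, n + 1), 2)


def canonical(x, y):
    """Return the pair with the smaller ID first, so (y, x) maps to (x, y)."""
    return (x, y) if x <= y else (y, x)


# Counting the pairs for 300 elements gives the same total as n * (n + 1) / 2.
pair_count = sum(1 for _ in unordered_pairs(300))
```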
In terms of data structures, I think you've answered your own question in your post: a matrix would be the easiest way to go about this. As for storage, as egrunin said, 90k records is not a lot. You can store this in a DB or a flat file. Simply store the pairs that have already gone through your testing, i.e. (A1,A2), (A1,A3), ...
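
As one possible take on the flat-file option, here is a minimal Python sketch that appends each tested pair to a CSV file and reloads it at the start of the next sitting. The three-column layout (first element, second element, result) and the function names load_tested and append_result are assumptions for the example.

```python
import csv
import os


def load_tested(path):
    """Read the file back into a dict mapping (x, y) to the recorded result."""
    tested = {}
    if os.path.exists(path):
        with open(path, newline="") as f:
            for x, y, result in csv.reader(f):
                tested[(x, y)] = result
    return tested


def append_result(path, x, y, result=""):
    """Append one tested pair; an empty result means no reaction occurred."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([x, y, result])
```

Appending a row per test means an interrupted session loses at most the pair in progress, and checking whether a pair is already done is a dictionary lookup on the loaded data.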