I have a table of 100,000开发者_高级运维s of USERS (name, age, gender, phone, company, street, city, state, country, zipcode, etc).
I also have a table of thousands of PROMOTIONS which are offered to users. Now, for each promotion I need to add a rule which defines which subset of users it applies to.
For example, a rule might be: All users which have a zipcode of 10000 to 19999.
Another rule might be: All users which are female and do NOT live in the U.S.
Yet a third rule could be: Users older than 18 that live in DC, MD or VA. Etc etc.
Now given a specific user, I want to (efficiently!) find out which PROMOTIONS apply to that user.
What is a good strategy of modeling these rules, given that I want to be able to quickly find matching promotions given a specific user?
Here is one approach.
If you can break down each rule to a simple comparison, it is possible to store it like that in a way which will work for everything. What you end up with is a sequence of rule items comprising a field, a value and a comparison (e.g. equals, not equals, etc). They also need to belong to an actual rule (which you can name), which would be listed in another table.
Now it gets complicated. Each rule item also a level. This is so you can do AND and OR comparisons as they require a heirarchy. The way I've done this before is that Even levels AND and Odd levels OR. That means two items that must be true for a rule would both be at level 0, that is, they must both validate. For a rule that has one mandatory condition and two more where either can be true, i.e. 1 AND (2 OR 3), you put the mandatory rule at level 0 and the other two at level 1.
The advantage of this approach is that the code to assemble the SQL filters get the nesting right because it has to get the AND/OR mode right. It also makes it possible to build an editor for the rules.
However, this solution, like all the others, still requires applying all the rules in turn to your user sets to find which rules allow the user you're interested in.
What if you stored your rules in "SQL-like" form? For example, you have a USERS table and a PROMOTIONS table as you already mentioned.
You could add 2 new tables, PROMOTION_RULE and PROMOTION_PROMOTION_RULE. The PROMOTION_RULE would contain the rules, and PROMOTION_PROMOTION_RULE would link a given promotion to its associated rule(s). This design allows you to share the same rule for multiple promotions.
In PROMOTION_RULE, you could store a column called RULE_SQL, and you could store values like:
(gender != 'F' and country != 'US')
(age > 18 and state IN ('DC','MD','VA')
etc.
Then you could look up the rules that apply for a promotion, and dynamically build an SQL statement which would be comprised of the rule_sql values.
It depends on who is entering the rules and the level of trust you have for them. If it is literally you who is adding the rules, you could store them as SQL statements in the database, but that seems hacky, or as stored procedures, but I'm not sure if you can have a column that references those (it could be a string column, of course).
You don't say what kind of framework you're using, but if you're using an ORM, your rules could be query objects, which can be serialized to the database as BLOBs. For example, if you were using SQLAlchemy for Python, each rule could be an instance of Query that you could pass to the User.filter function.
In all of these cases, you're allowing arbitrary code to be injected, so they're not suitable for allowing others to add rules. The only good solution would be to create a limited domain-specific language to express the rules, and convert those to either SQL or your ORM language.
Seams to me that your rules can really be anything and are better represented by different queries.
They should go into the Model (MVC) or maybe into stored procedures. Can't really make my mind upon it.
Everytime a new promotion comes into town the system would go over all users and store the matches into a table linking users and promotions. That would also be true to every new user registering. You could also consider putting it into a cronjob.
EDIT:
Do note that the table that links the users directly to the promotions is crucial if you are really looking for performance.
Just my two cents.
This is not something normally done at the database level; instead "rules" would be interpreted by your backend (whatever language you're using there). How they are stored in the database then depends on how they are modeled. This approach can be as simple or as complex as it needs to be - rules can be dynamically defined by your end users, etc...
That said, if you're looking for quick-and-dirty (with HUGE emphasis on dirty) solution, you can perhaps store your rules as actual queries (or where
conditions for queries). You can then write a stored procedures to apply said rules.
精彩评论