Is there a good pattern for distributed software and one backend database for this problem?

Developer (https://www.devze.com), 2023-01-25 19:20, source: web

I'm looking for a high-level answer, but here are some specifics in case they help: I'm deploying a J2EE app to a WebLogic cluster, with a single Oracle database at the backend.

A normal flow of the app is

- users feed data (to be inserted as rows) to the app

- the app waits for the data to reach a certain size and does a batch insert into the database (only 1 commit)

There's a constraint in the database preventing "duplicate" data insertions. If the app gets a constraint violation, it has to roll back and re-insert one row at a time, so that the duplicate rows can be "renamed" and inserted.
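To make the flow above concrete, here is a minimal Java sketch of it. It is not the real application code: the `Store` class is a hypothetical in-memory stand-in for the Oracle table and its unique constraint, `DuplicateKeyException` stands in for the constraint violation, and the `_1`, `_2` suffix is an assumed rename rule, since the actual renaming scheme is not described here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BatchWithFallback {

    /** Stand-in for the constraint violation the database would raise. */
    static class DuplicateKeyException extends RuntimeException {
        DuplicateKeyException(String key) { super("duplicate key: " + key); }
    }

    /** Hypothetical in-memory stand-in for the table with its unique constraint. */
    static class Store {
        private final Set<String> keys = new HashSet<>();

        /** All-or-nothing batch insert: nothing is kept if any key collides. */
        synchronized void insertBatch(List<String> batch) {
            Set<String> staged = new HashSet<>();
            for (String key : batch) {
                if (keys.contains(key) || !staged.add(key)) {
                    throw new DuplicateKeyException(key); // plays the role of the rollback
                }
            }
            keys.addAll(staged);
        }

        synchronized void insertOne(String key) {
            if (!keys.add(key)) throw new DuplicateKeyException(key);
        }

        synchronized int size() { return keys.size(); }
    }

    /** Tries one batch; on a violation, inserts row by row and renames duplicates. */
    static List<String> flush(Store store, List<String> batch) {
        try {
            store.insertBatch(batch);
            return new ArrayList<>(batch);
        } catch (DuplicateKeyException e) {
            List<String> stored = new ArrayList<>();
            for (String key : batch) {
                String candidate = key;
                int suffix = 1;
                while (true) {
                    try {
                        store.insertOne(candidate);
                        stored.add(candidate);
                        break;
                    } catch (DuplicateKeyException dup) {
                        candidate = key + "_" + suffix++; // assumed rename rule
                    }
                }
            }
            return stored;
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        flush(store, List.of("a", "b", "c"));                  // first instance
        List<String> second = flush(store, List.of("c", "d")); // second instance collides on "c"
        System.out.println(second); // prints [c_1, d]
    }
}
```

The second call shows the cost in question: a single duplicate forces the whole batch onto the row-by-row path.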

Suppose I had 2 running instances of the app, each about to insert 1000 rows. Even if there is only 1 duplicate, one instance will have to roll back and insert its rows one by one.

I can easily see that it would be smarter to re-insert the non-conflicting 999 rows as a batch in this instance, but what if I had 3 running apps and the 999 rows also had a chance of duplicates?
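One possible way to keep most rows in batches even when the remainder may collide again is to split a failing batch in half and retry each half, which isolates the duplicates without knowing them in advance. The Java sketch below illustrates that idea only; it is not an established answer to the question. `Store` is again a hypothetical in-memory stand-in for the table, `tryInsertBatch` is an assumed all-or-nothing operation (in a real database each attempt would need its own transaction or savepoint), and `batchCalls` exists only to show the number of round trips.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SplitAndRetry {

    /** Hypothetical in-memory stand-in for the table and its unique constraint. */
    static class Store {
        private final Set<String> keys = new HashSet<>();
        int batchCalls = 0; // counts attempted batches, for illustration only

        /** All-or-nothing: returns false and keeps nothing if any key collides. */
        synchronized boolean tryInsertBatch(List<String> batch) {
            batchCalls++;
            Set<String> staged = new HashSet<>(batch);
            if (staged.size() != batch.size()) return false; // duplicate inside the batch
            for (String key : batch) {
                if (keys.contains(key)) return false;
            }
            keys.addAll(staged);
            return true;
        }

        synchronized int size() { return keys.size(); }
    }

    /** Inserts what it can in batches and returns the rows that collided. */
    static List<String> insert(Store store, List<String> rows) {
        List<String> rejected = new ArrayList<>();
        if (rows.isEmpty() || store.tryInsertBatch(rows)) return rejected;
        if (rows.size() == 1) {      // a single failing row is itself the duplicate
            rejected.add(rows.get(0));
            return rejected;
        }
        int mid = rows.size() / 2;   // otherwise split and retry each half
        rejected.addAll(insert(store, rows.subList(0, mid)));
        rejected.addAll(insert(store, rows.subList(mid, rows.size())));
        return rejected;
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.tryInsertBatch(List.of("row500")); // another instance got there first
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 1000; i++) rows.add("row" + i);
        List<String> rejected = insert(store, rows);
        System.out.println(rejected + " rejected after " + store.batchCalls + " batch attempts");
    }
}
```

With one duplicate among n rows, the number of batch attempts grows roughly with log2(n) rather than n. Only the rejected rows then need the rename-and-retry treatment. The trade-off is that many duplicates push the cost back toward one attempt per row.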

So my question is this: is there a design pattern for this kind of situation?

This is a long question, so please let me know where to clarify. Thank you for your time.

EDIT: The 1000 rows of data are in memory for each instance, but the instances cannot see each other's rows. The only way an instance knows whether a row is a duplicate is when the row is inserted into the database.

And if the current application design doesn't make sense, feel free to suggest better ways of tackling this problem. I would appreciate it very much.

http://www.oracle-developer.net/display.php?id=329

The simplest approach would be to avoid parallel processing of the same data. For example, your size- or time-based event could run on only one node, or it could post a message to a JMS queue so that only one of the nodes processes it (for instance, by using a similar duplicate check, e.g. based on a timestamp of the message/batch).
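A rough Java sketch of the single-consumer idea follows. It uses an in-process `BlockingQueue` as a stand-in for the JMS queue, so it shows the shape of the approach rather than real JMS or WebLogic code. The `buildBatch` helper, the `alreadyStored` set (keys the single writer assumes it has inserted before), and the `_1` rename suffix are all hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class SingleWriter {

    /** Drains the queue and renames duplicates in memory, so one batch insert can follow. */
    static List<String> buildBatch(BlockingQueue<String> queue, Set<String> alreadyStored) {
        List<String> drained = new ArrayList<>();
        queue.drainTo(drained);
        Set<String> seen = new HashSet<>(alreadyStored);
        List<String> batch = new ArrayList<>();
        for (String key : drained) {
            String candidate = key;
            int suffix = 1;
            while (!seen.add(candidate)) {        // collision: pick the next free name
                candidate = key + "_" + suffix++; // assumed rename rule
            }
            batch.add(candidate);
        }
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        // Two threads play the role of two cluster nodes posting rows.
        Thread nodeA = new Thread(() -> { queue.add("a"); queue.add("b"); });
        Thread nodeB = new Thread(() -> { queue.add("b"); queue.add("c"); });
        nodeA.start();
        nodeB.start();
        nodeA.join();
        nodeB.join();
        // Only this one consumer builds and inserts batches.
        List<String> batch = buildBatch(queue, new HashSet<>());
        System.out.println(batch); // four distinct keys; whichever "b" arrives second becomes "b_1"
    }
}
```

Because only one consumer sees all rows, duplicates can be resolved before the insert, and the batch should go through in a single commit as long as nothing else writes to the table.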