Problem with high-performance streaming data processing

Developer (开发者) https://www.devze.com 2023-03-13 20:43 Source: Internet
I have a socket connection that receives streaming data, about 150 million messages per day. Each message has to be processed after it is received. Because the volume is so large, I have multithreaded the message processing code. Right now I have 8 threads with 8 message queues. The socket thread puts incoming messages into these queues in turn, and each worker thread only needs to process the messages in its own queue.

My problem right now is that my queue is overflowing.

Should I have more queues? All threads write to a shared object through a synchronized method. Would more queues interfere with each other and make things worse?

Should I use a bigger buffer? That seems safe to do, but I really want to process the messages faster.

Should I change my design? Are there any good recommendations or guides to follow?

Any comments are welcome.


Why are you using separate queues? The usual way to distribute work is to have one shared queue that all the workers read from. In Java you can easily do this with a shared BlockingQueue. That way the jobs are distributed more evenly, because a worker only pulls a job off the queue when it is not busy. With your strategy, a slow worker's queue can end up building a backlog while other workers sit idle. To keep the queue from overflowing, you can put a maximum size on it, and the producer will then pause whenever the backlog gets too big.
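A minimal sketch of the shared bounded queue idea, assuming string messages and a trivial "processing" step (a counter increment). The class name `SharedQueueDemo`, the `run` method, and the stop sentinel are illustrative only and not from the original post. `ArrayBlockingQueue.put` blocks the producer while the queue is full, and `take` blocks a worker while it is empty.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

public class SharedQueueDemo {
    // Hypothetical stop marker; compared by identity so no real message can match it.
    private static final String POISON = new String("STOP");

    // One producer and `workers` consumers share a single bounded queue.
    // Returns the number of messages the workers processed.
    public static long run(int workers, int messages, int capacity)
            throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        AtomicLong processed = new AtomicLong();

        Thread[] pool = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            pool[i] = new Thread(() -> {
                try {
                    while (true) {
                        String msg = queue.take();   // blocks while the queue is empty
                        if (msg == POISON) {
                            break;                   // this worker is done
                        }
                        processed.incrementAndGet(); // stand-in for real processing
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[i].start();
        }

        // Producer (stands in for the socket reader): put() blocks while the
        // queue is full, so the backlog can never exceed `capacity`.
        for (int i = 0; i < messages; i++) {
            queue.put("msg-" + i);
        }
        // One stop marker per worker, queued after all real messages.
        for (int i = 0; i < workers; i++) {
            queue.put(POISON);
        }
        for (Thread t : pool) {
            t.join();
        }
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("processed: " + run(8, 100_000, 1024));
    }
}
```

Note that a blocked producer means the socket is not being read. Whether that is acceptable depends on whether the sender can tolerate backpressure, which the question does not say.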

You mention that you want the whole thing to go faster. While the above recommendations may or may not help, the only way to truly solve the problem is to run the system under a profiler and see where the bottleneck is (many times, it is not where you think it is). Otherwise you can spend a lot of time optimizing code without it helping in the end. There are plenty of good free profilers for Java (NetBeans, jvisualvm, Eclipse) and C++ (Valgrind). A great non-free one for Java is YourKit Java Profiler.


Does every processing thread write to the shared object after it has processed just one message? That could create a bottleneck. Try accumulating some temporary results in each thread before writing to the shared object.
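A rough sketch of that batching idea. It assumes, purely for illustration, that the shared object is a map of per-key counters, since the question does not say what it actually holds. `BatchedAggregator`, `work`, and `runDemo` are hypothetical names. Each worker accumulates counts in a thread-local map and calls the synchronized `merge` once per batch rather than once per message.

```java
import java.util.HashMap;
import java.util.Map;

public class BatchedAggregator {
    // Assumed shape of the shared object: per-key totals.
    private final Map<String, Long> totals = new HashMap<>();

    // One lock acquisition merges a whole batch of partial results.
    public synchronized void merge(Map<String, Long> partial) {
        for (Map.Entry<String, Long> e : partial.entrySet()) {
            totals.merge(e.getKey(), e.getValue(), Long::sum);
        }
    }

    public synchronized long get(String key) {
        return totals.getOrDefault(key, 0L);
    }

    // Worker body: count keys locally, flush to the shared object every
    // `batchSize` messages, and once more at the end for the remainder.
    public static void work(BatchedAggregator shared, String[] keys, int batchSize) {
        Map<String, Long> local = new HashMap<>();
        int pending = 0;
        for (String key : keys) {
            local.merge(key, 1L, Long::sum);
            if (++pending >= batchSize) {
                shared.merge(local);
                local.clear();
                pending = 0;
            }
        }
        if (!local.isEmpty()) {
            shared.merge(local);
        }
    }

    // Runs `threads` workers, each over `perThread` alternating "a"/"b" keys,
    // and returns the final total for key "a".
    public static long runDemo(int threads, int perThread, int batchSize)
            throws InterruptedException {
        BatchedAggregator shared = new BatchedAggregator();
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            String[] keys = new String[perThread];
            for (int i = 0; i < perThread; i++) {
                keys[i] = (i % 2 == 0) ? "a" : "b";
            }
            pool[t] = new Thread(() -> work(shared, keys, batchSize));
            pool[t].start();
        }
        for (Thread th : pool) {
            th.join();
        }
        return shared.get("a");
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("total for a: " + runDemo(8, 100_000, 1_000));
    }
}
```

A larger batch size means less lock contention, but the shared totals lag further behind. The right value depends on how fresh the shared state needs to be.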
