I need to design a system that has these basic components:
- A Webserver which will be getting ~100 requests/sec. The webserver only needs to dump data into raw data repository.
- Raw data repository which has a single table which gets 100 rows/s from the webserver.
- A raw data processing unit (Simple processing, not much. Removing invalid raw data, inserting missing components into damaged raw data etc.)
- Processed data repository
Does it make sense in such a system to have a service layer on which all components would be built? All inter-component interaction will go through the service layers. While this would make the system easily upgradeable and maintainable, would it not also have a significant performance impact since I have so much traffic to handle?
Here's what can happen unless you guard against it.
In the communication between layers, some format is chosen, like XML. Then you build it and run it and find out the performance is not satisfactory.
Then you mess around with profilers which leave you guessing what the problem is.
When I worked on a problem like this, I used the stackshot technique (pausing the program at random a few times and reading the call stack) and quickly found the problem. You would have thought it was I/O. It was not. Converting data to XML, and parsing XML to recover the data structure, was taking roughly 80% of the time. It wasn't too hard to find a better way to do that. Result: a 5x speedup.
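To get a feel for how much the wire format can matter, here is a rough sketch that round-trips the same records through XML and through a fixed-width binary layout and times both. The record layout, the `struct` format and the function names are all assumptions made for illustration; the answer above does not say which format replaced XML, and the numbers will vary by machine.

```python
import struct
import time
import xml.etree.ElementTree as ET

# Hypothetical record layout: (sensor_id, timestamp, value).
RECORDS = [(i, 1_700_000_000 + i, i * 0.5) for i in range(1000)]
PACK_FORMAT = "<iqd"  # int32, int64, float64, little-endian (an assumption)


def xml_round_trip(records):
    """Serialize records to an XML byte string, then parse them back."""
    root = ET.Element("records")
    for sensor_id, timestamp, value in records:
        ET.SubElement(root, "r", id=str(sensor_id), ts=str(timestamp), v=repr(value))
    payload = ET.tostring(root)
    parsed = ET.fromstring(payload)
    return [(int(e.get("id")), int(e.get("ts")), float(e.get("v"))) for e in parsed]


def binary_round_trip(records):
    """Pack records into fixed-width binary, then unpack them."""
    payload = b"".join(struct.pack(PACK_FORMAT, *r) for r in records)
    return list(struct.iter_unpack(PACK_FORMAT, payload))


def time_it(func, records, repeats=20):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(records)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print(f"XML:    {time_it(xml_round_trip, RECORDS) * 1e3:.2f} ms")
    print(f"binary: {time_it(binary_round_trip, RECORDS) * 1e3:.2f} ms")
```

Both functions return the original records, so the comparison is between two formats carrying the same data. Measure with your own payloads before drawing conclusions.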
What do you see as the costs of having a separate service layer?
How do those costs compare with the costs you must incur anyway? In your case those seem to be at least:
- a network read for the request
- a database write for raw data
- a database read of raw data
- a database write of processed data
Plus some data munging.
What sort of services do you have in mind? Perhaps:
- saveRawData()
- getNextRawData()
- writeProcessedData()
Why is the overhead any more than a procedure call? A service does not need to imply "separate process" or "web service marshalling".
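As a minimal sketch of that idea, the three operations above can be plain methods on an in-process object. Everything here is hypothetical: the class name, the in-memory containers standing in for the two repositories, and the cleanup rule in `process`. The method names are the ones listed above, written in Python's snake_case.

```python
from collections import deque
from typing import Optional


class DataService:
    """In-process service layer; each operation is an ordinary method call."""

    def __init__(self) -> None:
        self._raw = deque()    # stand-in for the raw data repository
        self._processed = []   # stand-in for the processed data repository

    def save_raw_data(self, row: dict) -> None:
        self._raw.append(row)

    def get_next_raw_data(self) -> Optional[dict]:
        return self._raw.popleft() if self._raw else None

    def write_processed_data(self, row: dict) -> None:
        self._processed.append(row)

    def processed_rows(self) -> list:
        return list(self._processed)


def process(row: dict) -> Optional[dict]:
    """Hypothetical cleanup: drop rows without an id, fill in a missing value."""
    if "id" not in row:
        return None
    return {"id": row["id"], "value": row.get("value", 0)}


def run_processing_unit(service: DataService) -> None:
    """Drain the raw repository through the service layer only."""
    while (row := service.get_next_raw_data()) is not None:
        cleaned = process(row)
        if cleaned is not None:
            service.write_processed_data(cleaned)
```

The webserver and the processing unit only ever see `DataService`, so the storage behind it can later be swapped for a real database or a remote call without touching the callers.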
I contend that structure is always of value; separation of concerns in your application really matters. Compared with database activity, a few procedure calls will rarely cost much.
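If you want to check that claim on your own stack, a rough measurement along these lines may help. It is only a sketch: it uses an in-memory SQLite database as a stand-in for the real repository (a networked database would make the write far more expensive), and the table and function names are made up for the example.

```python
import sqlite3
import time


def insert_direct(conn, value):
    """Write straight to the database."""
    conn.execute("INSERT INTO raw(value) VALUES (?)", (value,))


def save_raw_data(conn, value):
    """Hypothetical service-layer wrapper: one extra procedure call."""
    insert_direct(conn, value)


def timed_inserts(write, count=5000):
    """Insert `count` rows using `write`; return (elapsed seconds, rows stored)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw(id INTEGER PRIMARY KEY, value TEXT)")
    start = time.perf_counter()
    for i in range(count):
        write(conn, f"row-{i}")
    conn.commit()
    elapsed = time.perf_counter() - start
    rows = conn.execute("SELECT COUNT(*) FROM raw").fetchone()[0]
    conn.close()
    return elapsed, rows


if __name__ == "__main__":
    direct, _ = timed_inserts(insert_direct)
    layered, _ = timed_inserts(save_raw_data)
    print(f"direct:  {direct * 1e3:.1f} ms")
    print(f"layered: {layered * 1e3:.1f} ms")
```

The difference between the two timings is the cost of the extra call. Run it several times, since timings this small are noisy.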
In passing: raw data might best be persisted to a queuing system. You can then get some natural scaling by having many queue readers on separate machines if you need them. In effect, the queuing system naturally introduces some service-like concepts.
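The shape of that design can be sketched in a single process, with `queue.Queue` standing in for the queuing system and threads standing in for readers on separate machines. Those substitutions, the sentinel shutdown scheme and the validity check are all assumptions for illustration; a real deployment would use a message broker instead.

```python
import queue
import threading

SENTINEL = None  # tells a reader to stop


def reader(raw_queue, processed, lock):
    """Pull raw rows off the queue until the sentinel arrives."""
    while True:
        row = raw_queue.get()
        if row is SENTINEL:
            break
        if "id" in row:  # hypothetical validity check
            with lock:
                processed.append({"id": row["id"], "value": row.get("value", 0)})


def run(rows, reader_count=4):
    """Feed rows to several competing readers and collect the results."""
    raw_queue = queue.Queue()
    processed = []
    lock = threading.Lock()
    readers = [
        threading.Thread(target=reader, args=(raw_queue, processed, lock))
        for _ in range(reader_count)
    ]
    for thread in readers:
        thread.start()
    for row in rows:            # the webserver side: just enqueue
        raw_queue.put(row)
    for _ in readers:           # one sentinel per reader
        raw_queue.put(SENTINEL)
    for thread in readers:
        thread.join()
    return processed
```

Adding capacity means raising `reader_count` here, or starting more consumer processes against a real queue. The order of results is not guaranteed once more than one reader is running.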
Personally, I feel that you might be focusing too much on low-level implementation details when designing the system. Before looking at how to lay out the components, assemblies or services, you should be thinking about how to architect the system.
You could start with the following high-level steps and build your system architecture around them:
- Confirm the technical skill set of the development team and the operations/support team.
- Agree on an initial finite list of systems that will integrate to your service, the protocols they support and some SLAs.
- Decide on the messaging strategy.
- Understand how you will deploy your service/system.
- Decide on the choice of middleware (ESBs, message brokers, etc.), databases (SQL, Oracle, Memcache, DB2, etc.) and third-party frameworks/tools.
- Decide on your caching and data latency strategy.
- Break your application into the various areas of business responsibility. This will allow you to split up the work and make it easier to communicate milestones during development, testing and implementation.
- Design each component as required to meet the areas of responsibility. The areas of responsibility should naturally lead you to decide how to design each component, assembly or service.
Obviously not all of the above will match your specific case but I would suggest that they should at least be given some thought.
Good luck.
Abstraction and tiering will introduce latency, but the real question is, what are you GAINING to make the cost(s) worthwhile? Loose coupling, governance, scalability, maintainability are worth real $.
Even the best-designed layered app will exhibit more latency than an app talking directly to a DB. Users who know the original system will feel the difference. They may not like it, so this can be a political issue as much as a technical one.