We are building an online community for 30M+ users, with RESTful services in its back-end and a front-end that uses them. My concern is: is it OK to use REST as the internal data transfer protocol, or will it significantly reduce performance compared with Java's binary serialization protocol (which is language dependent)? What other approaches or protocols could keep it language independent and as fast as possible?
The REST approach can work quite well, but the HTTP layer can slow things down. If you use REST in the back-end, you should make sure that the connection between your back-end and front-end is kept open and reused, not reopened with every request.
More details about HTTP keep-alive can be found here: http://en.wikipedia.org/wiki/HTTP_persistent_connection
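To make the keep-alive point concrete, here is a minimal sketch using only the Python standard library. The handler, the `client_ports` helper and the `/users/42` path are made up for illustration. The toy server answers each request with the client's source port as it sees it, so a reused connection shows up as the same port on every request.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PortEchoHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets the server keep the connection open between requests.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # The client's source port identifies the TCP connection it used.
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length is required so the client knows where the body ends
        # without the server closing the connection.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def client_ports(reuse_connection, count=3):
    """Return the client source port the server saw for each of `count` requests."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), PortEchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    ports = []
    try:
        conn = http.client.HTTPConnection(host, port, timeout=5)
        for _ in range(count):
            if not reuse_connection:
                # Simulate the "reopen for every request" anti-pattern.
                conn.close()
                conn = http.client.HTTPConnection(host, port, timeout=5)
            conn.request("GET", "/users/42")  # hypothetical resource path
            ports.append(int(conn.getresponse().read()))
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return ports


if __name__ == "__main__":
    print("reused connection:", client_ports(reuse_connection=True))
    print("new connection each time:", client_ports(reuse_connection=False))
```

With a reused connection all requests arrive from one source port, meaning one TCP handshake in total. In a real front-end you would normally get this from your HTTP client's connection pool, not manage it by hand.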
One advantage that REST gives you between the front-end and back-end layers is the flexibility to add a layer of HTTP caching in between to boost performance without needing to modify either of the existing layers. The same holds true for load balancing when scaling out the back-end, since HTTP load balancers are very well understood and easy to deploy.
Depending on the situation, these two benefits can give REST a major advantage over more traditional RPC serialization techniques, especially if you have "slow" back-end processes that can benefit from caching or load balancing.
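As a rough illustration of what such a caching layer does, here is a toy in-process sketch. It is not a substitute for a real HTTP cache, and `TinyCache`, `parse_max_age` and the `fetch` callable are names invented for this example. The back-end marks a response as cacheable with a `Cache-Control: max-age=N` header, and the layer in between answers repeat requests itself until the entry expires.

```python
import re
import time


def parse_max_age(cache_control):
    """Return max-age in seconds from a Cache-Control value, or 0 if not cacheable here."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    # Be conservative: a shared cache must not store these responses.
    if any(d in ("no-store", "no-cache", "private") for d in directives):
        return 0
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            return int(match.group(1))
    return 0


class TinyCache:
    """Toy cache sitting between front-end and back-end (illustration only)."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch      # callable: url -> (cache_control_header, body)
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # url -> (expires_at, body)
        self.backend_calls = 0   # how often the "slow" back-end was actually hit

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now < entry[0]:
            return entry[1]      # fresh entry: the back-end is not touched
        self.backend_calls += 1
        cache_control, body = self._fetch(url)
        max_age = parse_max_age(cache_control)
        if max_age > 0:
            self._entries[url] = (now + max_age, body)
        else:
            self._entries.pop(url, None)
        return body
```

Neither the front-end nor the back-end has to know the cache exists. The back-end only has to send sensible `Cache-Control` headers, which is why the layer can be added later.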
The other place REST wins out is if you need to expand the client base using the back-end services (which I think you hinted at with the desire for language independence). Not only does a REST-based service layer allow you to mix client languages freely, it also lets you open up your API to third-party developers with almost no extra effort. Having a platform for others to build on has proven to be a very successful business model, and it never hurts to keep your development as open and flexible as possible.
This is something that you will have to measure and compare before making a decision. It depends on what information is transferred, how often, and so on. Serialization may not be the bottleneck at all. Still, at this scale it is a good idea to consider Protocol Buffers.
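A measurement along those lines could start as small as the sketch below. It compares JSON with a hand-packed binary layout built with `struct`. The binary layout is only a stand-in for a schema-based format such as Protocol Buffers, not Protocol Buffers itself. The record shape and the function names are assumptions for the example.

```python
import json
import struct
import timeit

# Hypothetical record, roughly what a profile lookup might return.
RECORD = {"user_id": 123456789, "karma": 4711.5, "name": "alice"}

# Fixed header: int64 user_id, float64 karma, uint16 length of the UTF-8 name.
_HEADER = struct.Struct("<qdH")


def encode_json(record):
    return json.dumps(record, separators=(",", ":")).encode("utf-8")


def decode_json(data):
    return json.loads(data)


def encode_binary(record):
    name = record["name"].encode("utf-8")
    return _HEADER.pack(record["user_id"], record["karma"], len(name)) + name


def decode_binary(data):
    user_id, karma, name_len = _HEADER.unpack_from(data)
    start = _HEADER.size
    name = data[start:start + name_len].decode("utf-8")
    return {"user_id": user_id, "karma": karma, "name": name}


def measure(rounds=20000):
    """Return payload size and encode+decode round-trip time for each format."""
    results = {}
    for label, enc, dec in (
        ("json", encode_json, decode_json),
        ("binary", encode_binary, decode_binary),
    ):
        seconds = timeit.timeit(lambda: dec(enc(RECORD)), number=rounds)
        results[label] = {"bytes": len(enc(RECORD)), "seconds": seconds}
    return results


if __name__ == "__main__":
    for label, stats in measure().items():
        print(label, stats)
```

The numbers only mean something for your own payloads and call rates, so swap in real records and compare the result against the network and back-end processing time per request before deciding that serialization is worth optimizing.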