I have an application that serves artifacts from files (pages from PDF files as images), the original PDF files live on S3 and they are downloaded to the servers that generate the images when a client hits one of them. These machines have a local caching mechanism that guarantees that each PDF file is downloaded only once.
So, when a client comes with a request like "give me page 1 of 123.pdf", this cache is checked; if the PDF is not there, it is downloaded from S3, stored in the local cache, and then a process generates page 1 and sends the image back to the client.
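The "downloaded only once" guarantee could be sketched with a per-server map whose loader runs at most once per key; this is only an illustration of the idea, and the cache path and S3 call are made-up placeholders:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the per-server cache: each PDF is fetched from S3
// at most once per server, even under concurrent requests for the same document.
class PdfCache {
    private final Map<Integer, String> localPaths = new ConcurrentHashMap<>();

    // Stand-in for the real S3 download; returns where the file was stored.
    private String downloadFromS3(int docId) {
        return "/var/cache/pdfs/" + docId + ".pdf";
    }

    String localPathFor(int docId) {
        // computeIfAbsent runs the loader at most once per key, so concurrent
        // requests for the same document trigger a single download.
        return localPaths.computeIfAbsent(docId, this::downloadFromS3);
    }
}
```

Repeated calls for the same document id return the same cached path without touching S3 again.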
The client itself does not know it's connected to a special server; it all looks like it's just accessing the website. But, for the sake of performance, I would like to make sure this client is always directed to the same file server that served its first request (and downloaded the file from S3).
I could just set a cookie on the client to pin it to that specific file server, but placing the affinity on the client leads to unfair usage: some users open many documents and some open very few. So I would like to perform this load balancing at the resource level (per PDF document).
Each document has a unique identifier (an integer primary key in the database). My first solution was Redis: the document id is the key and the value is the host of the server machine that currently has the document cached. However, I would like to remove Redis, or find a simpler approach that does not require looking up keys in an external store.
Also, it would be nice if the defined algorithm or idea would allow for adding more file servers on the fly.
What would be the best way to perform this kind of load balancing with affinity based on resources?
Just for the sake of saying, this app is a mix of Ruby, Java and Scala.
I'd use the following approach in the load balancer:
- Strip the requested resource URL to remove the query and fragment parts.
- Turn the stripped URL into a String and take its hashcode.
- Use the hash code to select the back-end server from the list of available servers; e.g.

    String[] serverNames = ...;
    String serverName = serverNames[Math.floorMod(hash, serverNames.length)];

(Note that hashCode() can return a negative value, so a plain hash % serverNames.length could produce a negative index; Math.floorMod avoids that.)
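Since the app already includes Java, the selection step can be sketched as a small runnable routine; the server names used here are placeholders:

```java
import java.util.List;

// Minimal sketch of hash-based server selection. String.hashCode() can be
// negative, so Math.floorMod keeps the index within the list bounds.
class HashRouter {
    static String serverFor(String strippedUrl, List<String> serverNames) {
        int index = Math.floorMod(strippedUrl.hashCode(), serverNames.size());
        return serverNames.get(index);
    }
}
```

Because the mapping depends only on the stripped URL and the server list, the same document URL is always routed to the same server, with no shared state to consult.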
This spreads the load evenly across all servers, and always sends the same request to the same server. If you add more servers, it adjusts itself ... though you take a performance hit while the caches warm up again, since most documents will hash to a different server after the list changes.
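One way to shrink that warm-up hit when servers are added on the fly is rendezvous (highest-random-weight) hashing: each document goes to the server that scores highest for it, so adding a server only moves the documents whose new top score belongs to the newcomer. A sketch under those assumptions, with placeholder server names and a generic 64-bit mixer:

```java
import java.util.List;

// Rendezvous (highest-random-weight) hashing sketch: route each document to
// the server with the highest combined score. Adding a server reassigns only
// the documents it now wins, so most of the existing cache stays warm.
class RendezvousRouter {
    static String serverFor(int docId, List<String> servers) {
        String best = null;
        long bestScore = Long.MIN_VALUE;
        for (String server : servers) {
            // Combine the document id and server name into a single score.
            long score = mix(docId * 31L + server.hashCode());
            if (score > bestScore) {
                bestScore = score;
                best = server;
            }
        }
        return best;
    }

    // splitmix64 finalizer: spreads the combined value evenly over 64 bits.
    private static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }
}
```

The trade-off is an O(n) scan over the server list per request, which is negligible for a handful of file servers.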
I don't think you want to aim for "fairness"; i.e. some kind of guarantee that each request takes roughly the same time. To achieve fairness you need to actively monitor the load on each back end and dispatch according to load. That's going to (somewhat) negate the caching / affinity, and is going to consume resources to do the measurement and load-balancing decision making. A dumb load-spreading approach (e.g. my suggestion) should give you better throughput overall for your use case.