Stress testing for a distributed SOA architecture based System

Posted on 开发者 (https://www.devze.com), 2023-01-02 03:38. Source: web.

We currently have a system with 20 SOA services, backed by a single master MySQL database and 2 slave nodes. The database currently holds 10 GB of data. We have a requirement under which the data in one table is going to increase significantly. We want to stress test the system before proceeding with the implementation. What kind of stress testing makes sense for this kind of distributed environment?

Also, when performing the stress testing, I can watch latency metrics such as the latency within which 90% of the requests to a service are served. Are there any other good metrics for services? What metrics should I look at for the MySQL database?

Thank you

Here are a few ideas:

Set up a test database and load the additional data into the table where you're expecting the increase; use a multiple of the increase you're expecting, e.g. if you're expecting the table to grow by 2000 rows then add 4000 rows to your test table.
Enable slow query logging in MySQL.
Make sure the logging levels in your SOA servers are detailed enough to debug errors from your stress test.
Use a load test tool such as JMeter to run multiple requests against each service in rapid succession. Use a multiple of the number of requests per second you're expecting; I usually ramp up through 2x, 4x, 8x and so on of the expected request rate.
Repeat the above test against each individual service in turn.
Repeat the above test with a "typical" mix of services, e.g. if you expect twice as many requests to service 1 as to service 2 then reflect that in the test.
If you want to test reliability as well, try repeating the JMeter tests with one or both of the MySQL slave nodes taken offline.
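The ramp-up and the "typical" mix can be planned before anything is configured in JMeter. Here is a minimal Python sketch of that arithmetic; it is not JMeter configuration, and the expected rate of 50 requests per second, the service names and the 2:1 weights are made-up values for illustration.

```python
import random


def ramp_rates(expected_rps, multipliers=(2, 4, 8)):
    """Return the request rates (requests/second) to test at each ramp step."""
    return [expected_rps * m for m in multipliers]


def pick_services(weights, n_requests, seed=0):
    """Draw n_requests service names so their frequencies follow the weights."""
    rng = random.Random(seed)  # seeded so a test plan is repeatable
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=n_requests)


# Hypothetical numbers: 50 requests/second expected in production,
# and service1 called twice as often as service2.
expected_rps = 50
mix = {"service1": 2, "service2": 1}

for rate in ramp_rates(expected_rps):
    # One second's worth of requests at this ramp step.
    batch = pick_services(mix, rate)
    print(rate, batch.count("service1"), batch.count("service2"))
```

The counts per service are random draws, so they only approximate the 2:1 ratio; larger batches get closer to it.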

JMeter should give you all the latency info you need. Another useful "real world" data point from JMeter that I like to use is the 90% response time (the 90th percentile), which is the response time value that's greater than or equal to 90% of the test responses.
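To make the definition of the 90% value concrete, here is a small Python sketch using the nearest-rank method. JMeter computes its own figure and may interpolate differently; the latency samples below are invented.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample that is greater than
    or equal to pct percent of all samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Invented response times in milliseconds, with one slow outlier.
latencies_ms = [120, 95, 110, 480, 105, 130, 101, 99, 140, 115]
p90 = percentile(latencies_ms, 90)
print(f"90% of requests finished within {p90} ms")
```

Unlike the average, this value is not dragged up by the single 480 ms outlier, which is why it is a good summary of what most callers experience.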

The idea in this scenario is still to model the page requests and posts as they are used in production. The difference is that you first run a load test against a copy of the current 10 GB of production data, then add the simulated extra data and run the same load test again. You can then compare the response times of the pages that use the services, or of the service calls directly.

You can then see what effect the extra data will have on your service calls.
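The before/after comparison can be as simple as a ratio per service. A minimal Python sketch, assuming you have already extracted one response-time figure per service (for example the 90% value) from each run; the service names and numbers are hypothetical.

```python
def slowdown_report(baseline_ms, enlarged_ms):
    """Map each service measured in both runs to (before, after, ratio)."""
    report = {}
    for service, before in baseline_ms.items():
        after = enlarged_ms.get(service)
        if after is None or before <= 0:
            continue  # skip services missing from a run or with bad data
        report[service] = (before, after, after / before)
    return report


# Hypothetical response times (ms) from the two load test runs.
baseline = {"orders": 140.0, "search": 210.0}
enlarged = {"orders": 150.0, "search": 630.0}

report = slowdown_report(baseline, enlarged)
# List the services that slowed down the most first.
for service, (before, after, ratio) in sorted(
    report.items(), key=lambda item: item[1][2], reverse=True
):
    print(f"{service}: {before:.0f} ms -> {after:.0f} ms ({ratio:.2f}x)")
```

Services whose ratio stays close to 1 are unaffected by the extra data; the ones at the top of the list are where to start looking at queries and indexes.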

The metrics that are most important are the response times for the calls that you expect (or have measured) to be called most frequently.

Other statistics on the database and servers themselves can be analysed if you discover a performance issue.