1) The acceptable failure rate and the acceptable types of failures are really policy decisions. It's important to note that the relevant response time, for determining policy, is the user's, and this depends on other factors such as page size, bandwidth, and render time. I like the following criteria, but they are highly dependent on the application in question.
   a. The server should only rarely be run at full capacity, perhaps a few times a year.
   b. Under average load the server should generally respond in less than two seconds.
   c. Under high load the server should generally respond in less than five seconds. I think it's OK to have a percent or two of requests over this amount.
   d. The server should not time out or fail.

2) I think this is a common problem. Performing a load test on any type of server is much better than nothing, but extrapolating throughput is not generally possible. Exceptions to this rule largely occur when you are scaling horizontally and have some reason to believe the scaling will be fairly linear. I have been forced to guess in the past. I generally insist on the same memory configuration for my test servers, and when horribly pressed, I just guess at a factor to multiply by. Even then, I don't report my guesses to customers under any circumstances.

3) This is a pretty big topic. I think www.javaperformancetuning.com has some nice content, but I don't have a specific article link to give you.
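To make criterion c concrete, here is a minimal sketch of how one might check a batch of measured response times against it. The five-second limit and the "percent or two" allowance come from the criteria above; the function name and sample data are hypothetical.

```python
def meets_high_load_criterion(response_times, limit=5.0, allowed_over=0.02):
    """True if at most ~2% of the sampled requests exceeded the limit."""
    over = sum(1 for t in response_times if t > limit)
    return over / len(response_times) <= allowed_over

# Fabricated sample of 100 high-load response times (seconds):
# 99 fast requests and one slow one -> 1% over the limit, which passes.
samples = [1.2] * 99 + [6.5]
print(meets_high_load_criterion(samples))  # True
```

The same check with the limit dropped to two seconds covers criterion b for average load; criterion d would be checked separately by counting timeouts and errors rather than latencies.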
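To illustrate the kind of guess described in point 2, here is a sketch of a naive linear extrapolation for horizontal scaling, multiplied by a guessed derating factor. Every number and name here is hypothetical, and, as the text says, such a guess is not something to report to a customer.

```python
def extrapolate_throughput(measured_rps, measured_nodes, target_nodes,
                           derating=0.8):
    """Naive linear scaling of per-node throughput, derated by a guess.

    derating is the made-up "factor to multiply by": it assumes the
    cluster loses some efficiency as it grows, which may not hold.
    """
    per_node = measured_rps / measured_nodes
    return per_node * target_nodes * derating

# E.g. 400 req/s measured on 2 nodes, guessing for 8 nodes:
print(extrapolate_throughput(400, 2, 8))  # 200 * 8 * 0.8 = 1280.0
```

The whole exercise only makes sense under the exception named in the text: horizontal scaling with some reason to believe it is fairly linear, and test servers with the same memory configuration as production.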
Thanks for the feedback! Ivan