There are two types of load testing: bottleneck identification and throughput. This question makes me think it is about bottlenecks, so the number of users is something of a red herring; instead, the goal for a given configuration is to find areas that can be improved to increase concurrency.
Application bottlenecks typically fall into three categories: database, memory leak, or slow algorithm. Finding them involves putting the application in question under stress (i.e. load) for an extended period of time: at least an hour, possibly up to several days. JMeter is a good tool for this purpose. One thing to consider is running the same tests with cookie handling enabled (i.e. JMeter saves cookies and sends them with each subsequent request) and with it disabled. Sometimes you get very different results, and this is important because the latter effectively mimics what some crawlers do to your site. Details on detecting each kind of bottleneck follow:
Database
Tables without indexes or SQL statements that contain multiple joins are frequent application bottlenecks. Every database server I have come across (MySQL, SQL Server and Oracle) has some way to log or identify slow SQL statements. MySQL has a slow query log, while SQL Server has dynamic management views that track the slowest-running SQL. Once you have the slow statements in hand, use the explain plan to see what the database engine is trying to do, and use any features that suggest indexes. If those two options do not solve the bottleneck, consider other strategies such as denormalization.
Memory leak
Turn on verbose garbage collection logging and a JMX monitoring port. Then use jConsole, which provides much better graphs, to observe trends. In particular, leaks usually show up as the Old Gen or Perm Gen space filling up. A leak is a bottleneck because the JVM spends more and more time attempting garbage collection unsuccessfully until an OutOfMemoryError (OOM) is thrown.
Perm Gen filling up implies the need to increase that space with a JVM command-line option. Old Gen filling up implies a leak, in which case you should stop the load test, generate a heap dump, and then use the Eclipse Memory Analyzer (MAT) to identify the leak.
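If attaching jConsole is not convenient, the same pool figures can be read from inside the JVM through the standard java.lang.management API. The following is only a minimal sketch: the class name HeapPoolSnapshot is made up for illustration, and the pool names it prints depend on the JVM version and garbage collector, so do not expect a pool literally called "Old Gen" on every setup.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper name; not part of any library.
class HeapPoolSnapshot {
    /** Returns one line per heap memory pool: name, used bytes, max bytes (-1 if undefined). */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            lines.add(pool.getName() + " used=" + usage.getUsed() + " max=" + usage.getMax());
        }
        return lines;
    }

    // Prints a snapshot once per second, three times; the interval is arbitrary.
    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 3; i++) {
            snapshot().forEach(System.out::println);
            Thread.sleep(1000);
        }
    }
}
```

Logged periodically during a long load test, a pool whose "used" figure keeps climbing even after full collections is the trend described above.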
Slow algorithm
This is harder to track down. The most common offenders are synchronization, inter-process communication (for example, RMI or web services) and disk I/O. Another common problem is code using nested loops (look, Mom, O(n^2) performance!).
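To make the nested-loop point concrete, here is a small illustrative sketch of my own, not taken from any particular code base. Both methods answer the same question (does the list contain a duplicate?), but the first compares every pair while the second makes a single pass with a HashSet. The class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical example class; names are for illustration only.
class DuplicateCheck {
    /** O(n^2): compares every pair of elements. */
    static boolean hasDuplicateQuadratic(List<String> items) {
        for (int i = 0; i < items.size(); i++) {
            for (int j = i + 1; j < items.size(); j++) {
                if (items.get(i).equals(items.get(j))) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Expected O(n): one pass, remembering what has been seen. */
    static boolean hasDuplicateLinear(List<String> items) {
        Set<String> seen = new HashSet<>();
        for (String item : items) {
            if (!seen.add(item)) { // add() returns false if the item was already present
                return true;
            }
        }
        return false;
    }

    // Rough timing on a list with no duplicates (the worst case for both methods).
    public static void main(String[] args) {
        List<String> items = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            items.add("item-" + i);
        }
        long t0 = System.nanoTime();
        hasDuplicateQuadratic(items);
        long t1 = System.nanoTime();
        hasDuplicateLinear(items);
        long t2 = System.nanoTime();
        System.out.println("quadratic ms: " + (t1 - t0) / 1_000_000);
        System.out.println("linear ms:    " + (t2 - t1) / 1_000_000);
    }
}
```

Under light load the quadratic version may look fine; it is under sustained load with growing inputs that it turns into the kind of hot spot a stack dump keeps landing on.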
The best way I have found to locate these problems, absent some deeper knowledge of the code, is to generate stack dumps. They tell you what all the threads are doing at a given point in time. What you are looking for are BLOCKED threads or several threads all executing the same code. This usually points to some slowness within the code base.
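A stack dump can also be taken from inside the JVM with the standard ThreadMXBean, which is handy if you want to sample automatically during a load test. The sketch below filters for BLOCKED threads only; the class name BlockedThreadReport and the "holder"/"waiter" threads in main are invented for the demonstration and are not part of any real application.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper name; not part of any library.
class BlockedThreadReport {
    /** Returns one line per thread that is currently in the BLOCKED state. */
    static List<String> blockedThreads() {
        List<String> report = new ArrayList<>();
        // (false, false): skip the lists of locked monitors and ownable synchronizers.
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().dumpAllThreads(false, false)) {
            if (info.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            StackTraceElement[] stack = info.getStackTrace();
            String top = stack.length > 0 ? stack[0].toString() : "(no frames)";
            report.add(info.getThreadName() + " waiting for " + info.getLockName()
                    + " held by " + info.getLockOwnerName() + " at " + top);
        }
        return report;
    }

    // Demo: "holder" keeps a monitor that "waiter" wants, so "waiter" shows up as BLOCKED.
    public static void main(String[] args) throws InterruptedException {
        Object lock = new Object();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (lock) {
                held.countDown();
                try { release.await(); } catch (InterruptedException ignored) { }
            }
        }, "holder");
        Thread waiter = new Thread(() -> { synchronized (lock) { } }, "waiter");
        holder.start();
        held.await();                 // make sure the lock is taken first
        waiter.start();
        while (waiter.getState() != Thread.State.BLOCKED) {
            Thread.sleep(10);
        }
        blockedThreads().forEach(System.out::println);
        release.countDown();          // let both threads finish
        holder.join();
        waiter.join();
    }
}
```

Taking several samples a few seconds apart and seeing the same lock or the same top frame repeatedly is a stronger signal than any single dump.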