I work for a Fortune 500 company that struggles with accurately measuring performance and availability for high-availability applications (i.e., applications expected to be up 99.5% of the time, with page-to-page load times under 5 seconds). We factor in both scheduled and unscheduled downtime to arrive at this availability number. However, we recently added a CDN to the mix, which complicates our metrics a bit. The CDN now handles about 75% of our traffic, with the remainder going to our own servers.
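For a rough sense of scale, here is a back-of-the-envelope conversion of that 99.5% target into an allowed downtime budget (the 30-day month is just an assumption for illustration):

```python
# Back-of-the-envelope: downtime budget implied by a 99.5% availability target.
target = 0.995
hours_per_month = 30 * 24                        # assuming a 30-day month
allowed_downtime = (1 - target) * hours_per_month
print(allowed_downtime)                          # ≈ 3.6 hours of downtime per month
```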
We try to measure what we call the "true user experience" (i.e., our test scripts emulate a typical user clicking through the application). These monitoring scripts sit outside our network, which means we hit the CDN about 75% of the time.
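In spirit, each check boils down to something like the sketch below: a timed request from outside the network that either succeeds within the page-load budget or counts as downtime for that interval. This is only a minimal illustration (the URL and the 5-second budget are assumptions), not our actual monitoring scripts, which walk full click paths:

```python
import time
import urllib.request

def probe(url: str, budget_seconds: float = 5.0) -> bool:
    """Return True if the page loads successfully within the time budget."""
    start = time.monotonic()
    try:
        # Any HTTP error, network failure, or timeout counts as a failed check.
        with urllib.request.urlopen(url, timeout=budget_seconds) as resp:
            resp.read()
    except Exception:
        return False
    return time.monotonic() - start <= budget_seconds
```

Each probe result then feeds the availability number: a failed or too-slow page counts as downtime for that check interval.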
Management has decided that we should take the worst-case scenario for availability. So if our origin servers are having problems but the CDN is serving content just fine, we still take an availability hit. The same is true the other way around. My thought is that as long as the "user experience" succeeds, we should not unnecessarily punish ourselves. After all, the whole point of a CDN is to improve performance and availability!
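To make the disagreement concrete, here is the arithmetic behind the two policies, using the 75/25 traffic split from above; the uptime figures are made-up illustrations:

```python
def worst_case_availability(cdn_up: float, origin_up: float) -> float:
    # Management's policy: any component outage counts as downtime,
    # so reported availability can never exceed the weaker component.
    return min(cdn_up, origin_up)

def traffic_weighted_availability(cdn_up: float, origin_up: float,
                                  cdn_share: float = 0.75) -> float:
    # "True user experience" policy: only the share of users actually
    # hitting the failed component experiences downtime.
    return cdn_share * cdn_up + (1 - cdn_share) * origin_up

cdn_up, origin_up = 0.999, 0.990                         # illustrative uptimes
print(worst_case_availability(cdn_up, origin_up))        # 0.99
print(traffic_weighted_availability(cdn_up, origin_up))  # ≈ 0.99675
```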
I'm just wondering whether anyone knows how other Fortune 500 companies calculate their availability numbers. Look at apple.com, for example: a storefront that uses a CDN and never seems to go down (unless there is a major product announcement). It would be great to have some hard, factual evidence, because I don't believe we should unnecessarily hurt ourselves on these metrics. We make business decisions based on these numbers.
I can say, however, that because these metrics are visible to management, issues get addressed and resolved quite quickly (read: we cut through the red tape fast). Unfortunately, as a developer, I don't want the application to be perceived as up or down because some external factor (i.e., the CDN) skews the numbers.
Thoughts?
(Source: Sanoj, ServerFault: https://serverfault.com/questions/119186/looking-for-a-recommendation-on-measuring-a-high-availability-app-that-is-using-a)