Wednesday, March 24, 2010

Capacity planning for business IT Systems

Capacity planning is a critical part of all Information Technology (IT) environments. It ensures that the number of servers, licenses, physical memory, bandwidth, disk space, room space, etc is sized properly for the maximum return on investment. This ensures that the user experience is positive, while ensuring that unused capacity is not being paid for and managed without a valid return on investment. Capacity planning is ensuring that the workload for a given environment is properly understood and mapped to create a link between a specific number of users and the amount of infrastructure needed to handle those users’ applications. That information is then laid out on a calendar to ensure that the load over time is understood so that capacity can be added and removed as necessary, without a negative impact on user experience.

This document is meant to serve as a guide to what information to consider when beginning the process of capacity planning within your environment. The purpose is to list the most common considerations and strategies for ensuring your capacity plan is an adequate model for infrastructure growth. First, let’s cover some terms and concepts that are important to understanding capacity planning:

Types of Capacity – Capacity is a very broad term, and within the realm of IT there are a variety of places that need to be considered when evaluating current and future capacity needs. While these are separate areas within the same context, each of them directly affects each other. The most common areas of focus for capacity planning are compute power (CPU speed and quantity), memory capacity (both RAM and hard disk), bandwidth (both within a single server and between devices), space (data center, office and storage facilities), data center (power and cooling), and human capitol (staff and contractors for operating the environment, and the relevant skills they posses).

365-day Calendar – Every business has highs and lows in terms of capacity needs. These can vary from month to month, and as often as hourly within the same day. As part of any long term capacity plan, a long-term calendar is needed to show the highs and lows in capacity needs. This should incorporate in holidays that affect the load on the environment, the needs of the business for reporting and trending and any audit needs based on industry specific rules.

User Load relative to Capacity – This is a formal mapping of a specific user load, to a defined amount of capacity with constraints around user response time, availability and user experience. This is the building block for a corporate wide capacity plan and enables staff to understand how much capacity must be added based on user growth. Most organizations will use their Service Level Agreements (SLAs) for setting this relationship.

The process to developing a capacity plan can be long and involve many steps depending on the complexity of the environment, dynamic nature of the work load, and the type of software being using. These are the most common steps (in no particular order) for information gathering as part of developing this capacity plan:

Load Testing – This is the process of testing specific user loads on a known capacity of hardware. These, when done multiple times on varying configurations can develop a capacity model for how many users a set of hardware can support at maximum without performance degradation.

Review of similar environments/workloads – This step is to ensure that knowledge gained within the industry, in similar environments is applied to your capacity plan and your specific environment. This step is not meant to assume some other workload is identical to yours, it is probably not, but there probably are similar workloads that can provide guidance on what to test in the Load Testing phase and what models need to be developed to properly plan capacity.

Trial and Error – A large part of load testing is trial and error. Many environments are simply too large to fully test in a development or test environment. This trial and error can be done in a strategic manner, testing types of capacity needs that are the most likely to be impacted by large or abnormal loads.

SLAs – This is the process of documenting what contractual requirements are in place for ensuring the users get the level of availability, uptime, and performance they expect and have paid for with the service.

The above steps are part of the technical process to developing a capacity plan that is unique to your environment and its needs. These, in addition to carefully documenting when and how to increase capacity can ensure that when the environment hits pre-set triggers, capacity can be added easily, ensuring a consistent user experience.

Capacity planning is ensuring that a clearly defined user experience can be mapped to a specific amount of infrastructure to support that experience. This plan should include not only what increments capacity can be added in, but also what triggers cause that capacity increase to occur. This plan, when associated with a calendar of business needs, trends and holidays can ensure the proactive growth of the environment and a consistent user experience, without having excess capacity that is costing the firm money and not being fully utilized.

No comments: