Wednesday, March 24, 2010

Capacity planning for business IT Systems

Capacity planning is a critical part of every Information Technology (IT) environment. It ensures that the number of servers, licenses, physical memory, bandwidth, disk space, room space, etc. is sized properly for the maximum return on investment. The goal is a positive user experience without paying for and managing unused capacity that provides no valid return. Capacity planning means understanding the workload for a given environment and mapping it so that a specific number of users is linked to the amount of infrastructure needed to handle those users’ applications. That information is then laid out on a calendar so that the load over time is understood and capacity can be added and removed as necessary, without a negative impact on user experience.

This document is meant to serve as a guide to what information to consider when beginning the process of capacity planning within your environment. The purpose is to list the most common considerations and strategies for ensuring your capacity plan is an adequate model for infrastructure growth. First, let’s cover some terms and concepts that are important to understanding capacity planning:

Types of Capacity – Capacity is a very broad term, and within the realm of IT there are a variety of areas that need to be considered when evaluating current and future capacity needs. While these are separate areas within the same context, each of them directly affects the others. The most common areas of focus for capacity planning are compute power (CPU speed and quantity), memory and storage capacity (both RAM and hard disk), bandwidth (both within a single server and between devices), space (data center, office and storage facilities), data center (power and cooling), and human capital (staff and contractors for operating the environment, and the relevant skills they possess).

365-day Calendar – Every business has highs and lows in terms of capacity needs. These can vary from month to month, and as often as hourly within the same day. As part of any long-term capacity plan, a full-year calendar is needed to show the highs and lows in capacity needs. This should incorporate holidays that affect the load on the environment, the needs of the business for reporting and trending, and any audit needs based on industry-specific rules.
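As a rough illustration of such a calendar, the sketch below maps date ranges to load multipliers and looks up the expected peak user count for a given day. The baseline, the events, the multipliers and the name expected_users are all assumptions made up for this example; real values would come from your own monitoring data and business calendar.

```python
from datetime import date

# Hypothetical baseline: typical peak user count on an ordinary day.
BASELINE_USERS = 1000

# Illustrative calendar entries: (start, end, load multiplier, reason).
# None of these values come from a real environment.
CALENDAR_EVENTS = [
    (date(2010, 3, 29), date(2010, 4, 2), 1.5, "quarter-end reporting"),
    (date(2010, 11, 26), date(2010, 11, 29), 2.0, "holiday peak"),
    (date(2010, 12, 24), date(2010, 12, 26), 0.4, "holiday low"),
]


def expected_users(day):
    """Return the expected peak user count for the given day."""
    factors = [
        factor
        for start, end, factor, _reason in CALENDAR_EVENTS
        if start <= day <= end
    ]
    # If several events overlap, plan for the largest of them.
    multiplier = max(factors) if factors else 1.0
    return round(BASELINE_USERS * multiplier)


if __name__ == "__main__":
    for day in (date(2010, 3, 30), date(2010, 7, 1), date(2010, 12, 25)):
        print(day.isoformat(), expected_users(day))
```

Walking every day of the year through a function like this gives the highs and lows that the rest of the plan has to cover.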

User Load relative to Capacity – This is a formal mapping of a specific user load to a defined amount of capacity, with constraints around user response time, availability and user experience. This is the building block for a corporate-wide capacity plan and enables staff to understand how much capacity must be added based on user growth. Most organizations will use their Service Level Agreements (SLAs) for setting this relationship.
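A minimal sketch of this mapping, assuming capacity is counted in identical servers and that load testing has already shown how many users one server can carry within the SLA. The function name, the 250 users per server figure and the 20% headroom are hypothetical values chosen for illustration.

```python
def servers_needed(users, users_per_server, headroom_pct=20):
    """Return the number of servers needed for a given user load.

    users_per_server is the load one server handles within the SLA.
    headroom_pct is the share of each server kept in reserve, so only
    (100 - headroom_pct) percent of a server is planned for use.
    """
    if users < 0:
        raise ValueError("users must not be negative")
    if users_per_server <= 0:
        raise ValueError("users_per_server must be positive")
    if not 0 <= headroom_pct < 100:
        raise ValueError("headroom_pct must be in the range 0-99")

    # Work in integers (scaled by 100) to avoid floating-point rounding.
    usable_x100 = users_per_server * (100 - headroom_pct)
    demand_x100 = users * 100
    # Integer ceiling division: round any partial server up to a whole one.
    return (demand_x100 + usable_x100 - 1) // usable_x100


if __name__ == "__main__":
    for users in (1000, 1001, 5000):
        print(users, "users ->", servers_needed(users, 250), "servers")
```

With these assumed numbers each server is planned for 200 users, so user growth translates directly into a server count.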

The process of developing a capacity plan can be long and involve many steps, depending on the complexity of the environment, the dynamic nature of the workload, and the type of software being used. These are the most common steps (in no particular order) for information gathering as part of developing this capacity plan:

Load Testing – This is the process of testing specific user loads on a known capacity of hardware. When these tests are repeated on varying configurations, the results can be used to build a capacity model showing the maximum number of users a set of hardware can support without performance degradation.
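One simple way to turn load-test results into such a model is sketched below: for each configuration, take the highest tested user load whose response time stayed within the limit. The configurations, the measurements and the 500 ms limit are invented for this example, and a real model would likely use more data points per configuration.

```python
# Hypothetical load-test results: for each hardware configuration, a list
# of (concurrent_users, response_time_ms) measurements. Invented numbers.
LOAD_TEST_RESULTS = {
    "2 app servers": [(100, 180), (200, 240), (400, 620), (800, 1900)],
    "4 app servers": [(100, 170), (200, 200), (400, 310), (800, 480)],
}


def max_supported_users(results, max_response_ms):
    """Return the highest tested load that stayed within the response limit.

    Stops at the first load level that exceeds the limit, so a later
    measurement that happens to pass is not counted.
    """
    supported = 0
    for users, response_ms in sorted(results):
        if response_ms > max_response_ms:
            break
        supported = users
    return supported


def capacity_model(all_results, max_response_ms):
    """Map each configuration to the user load it supported in testing."""
    return {
        config: max_supported_users(results, max_response_ms)
        for config, results in all_results.items()
    }


if __name__ == "__main__":
    for config, users in capacity_model(LOAD_TEST_RESULTS, 500).items():
        print(config, "->", users, "users")
```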

Review of similar environments/workloads – This step ensures that knowledge gained within the industry, in similar environments, is applied to your capacity plan and your specific environment. It does not assume that some other workload is identical to yours, because it probably is not. Similar workloads can, however, provide guidance on what to test in the Load Testing phase and what models need to be developed to properly plan capacity.

Trial and Error – A large part of load testing is trial and error. Many environments are simply too large to fully test in a development or test environment. This trial and error can be done in a strategic manner, testing types of capacity needs that are the most likely to be impacted by large or abnormal loads.

SLAs – This is the process of documenting what contractual requirements are in place for ensuring the users get the level of availability, uptime, and performance they expect and have paid for with the service.

The above steps are part of the technical process of developing a capacity plan that is unique to your environment and its needs. Together with careful documentation of when and how to increase capacity, they ensure that when the environment hits pre-set triggers, capacity can be added easily and the user experience stays consistent.
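A pre-set trigger can be as simple as a pair of utilization thresholds and a fixed increment. The sketch below assumes utilization is measured as a percentage of planned capacity; the 75% and 30% thresholds, the two-server increment and the function name are hypothetical and would be replaced by the values documented in your own plan.

```python
def capacity_change(utilization_pct, add_trigger_pct=75,
                    remove_trigger_pct=30, increment_servers=2):
    """Return how many servers to add (positive) or remove (negative).

    Returns 0 while utilization stays between the two triggers.
    """
    if remove_trigger_pct >= add_trigger_pct:
        raise ValueError("remove trigger must be below add trigger")
    if utilization_pct >= add_trigger_pct:
        return increment_servers
    if utilization_pct <= remove_trigger_pct:
        return -increment_servers
    return 0


if __name__ == "__main__":
    for utilization in (20, 50, 80):
        print(utilization, "% ->", capacity_change(utilization))
```

Keeping a gap between the two thresholds avoids adding and removing the same capacity over and over when utilization hovers near a single limit.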

Capacity planning ensures that a clearly defined user experience can be mapped to a specific amount of infrastructure to support that experience. The plan should include not only the increments in which capacity can be added, but also the triggers that cause a capacity increase to occur. This plan, when associated with a calendar of business needs, trends and holidays, can ensure the proactive growth of the environment and a consistent user experience, without excess capacity that costs the firm money and is not fully utilized.

Monday, March 8, 2010

Remote Team Dynamics

In recent years companies have downsized offices at an increasing pace and subsequently hired more staff who work remotely. Working remotely can include a variety of alternative working arrangements, but is most commonly characterized by staff who work primarily from their home or from the customer location. This has created many teams whose staff are distributed across the country and the world. These remote employees typically have the freedom to work the hours they are most productive and at the location where they are most comfortable, which could include coworking spaces, coffee shops or parks.

One significant change as a team becomes more distributed and remote is that communication channels and patterns must evolve to ensure staff feel the same level of connection that they would if they worked in a traditional office setting. Communication models must adapt to ensure that staff not only feel connected to their team and manager, but also have effective methods to reach out to their team for discussions, advice and coordination.

I have worked in a variety of roles where my manager and I were in different states, as well as managed teams spread out as far as Australia, while I was based in Texas. This presented a unique challenge in ensuring that all team members had the same information and capabilities to do their job, regardless of their specific locations or timezones. Below are a few of the most successful methods I have found for managing a team that is distributed:

Weekly Team Meetings

The primary method for team communication, pass-down and discussion should be a weekly call. This provides a known, consistent forum for the team to discuss changes within the team and within the company, and to pass down information from management to the team. The focus of the call should be kept on items and topics that are relevant to the majority of the team. Sideline discussions on topics that do not interest or affect the entire group should be scheduled at a different time. Regular team calls are a great opportunity to foster team trust. These calls provide a place for team members to share their knowledge and experience, and allow for open communication on issues that need a second opinion or escalation.

The time of day at which these calls are held is critical to ensuring maximum participation and limiting the impact of the call on the regular work of the team. For teams that are spread across timezones it is beneficial to hold calls at alternating times. This can mean presenting the information twice, or rotating the schedule so that if one timezone must be up for a very early call, they do not have to make that sacrifice every week and other regions periodically take the calls at off times as well. Another option is to record the calls so that folks can listen to them at a more convenient time.

An agenda for the call should always be sent ahead of time so that people can prepare for the call. An agenda can also be used to set time limits for various topics to ensure that one topic does not unexpectedly consume the entire scheduled time for the call.

Finally, meeting notes should be provided after each call. These reinforce any policies stated to the team and give the staff a reference to refer to later should they forget what was said or decided on the call. These meeting notes can also serve as the official record for any decisions that require review and approval by the team or management.

Roundtable

Every call should contain a roundtable. This gives all team members a brief period to share lessons learned that impact the rest of the team, and allows folks to understand what their peers are working on and where they may be able to collaborate.

Each person's time should be limited so that they mention one highlight and one lowlight of the week. The purpose is to share lessons learned with the team so that best practices can be spread across the organization.

Alternative Team Communication

In addition to a regular team call, there are several other methods that can be used for communicating with the team and building strong bonds between members, regardless of location.

Team Discussion List - An email distribution list should be available for the members of the remote team to communicate on topics that would normally warrant a hallway conversation. This could be technical discussions, product discussions or questions posed to the team about a customer or product. This forum provides the team a known path for team communication and input.

Team Watercooler List - One thing remote teams often lose is the hallway discussion of personal news and announcements. A separate distribution list should be available to the team for discussion of topics that are not immediately applicable to the company, but that allow employees on the team to get to know each other better and share good news from their personal lives. This puts a personality behind each team member's name.


Remote teams are a new challenge that companies are beginning to experience as more and more staff work from home or under other alternative working arrangements. Regular communication with the team allows staff who are separated to keep in close contact, develop trust in one another and have quick paths for discussion with the team. Ensuring that communication flows regularly means these remote employees feel connected to the team and have the information they need to be successful in their roles at the company.