Sunday, May 4, 2008

Platform Decisions – Solution Architecture

Without a doubt, one of the oldest and most heated debates in the non-Windows world is emacs versus vi. Similarly, the ongoing debate about the correct platforms on which to conduct business is rapidly approaching the same fervor. The platform question spans a plethora of options: operating system (Windows, Linux, Solaris, HP-UX, etc.), systems architecture (cluster, grid, SMP), interconnects (Ethernet, Infiniband) and storage (NFS, GPFS, SAN, etc.). We are going to avoid this altercation and focus on the questions that must be asked when evaluating a platform objectively. In this article we will focus strictly on the systems architecture question: do we assemble commodity parts into a cluster or grid, or do we purchase a more custom SMP-type solution for running a company’s applications?

Before we start, it should be noted that there are a variety of definitions for cluster and grid in the industry today. For the purposes of this discussion, we will define a cluster as an environment of identical commodity hardware used to run a specific set of fewer than a dozen applications. A grid, on the other hand, is an interconnected set of systems used to run a variety of applications, usually numbering in the dozens or hundreds. The applications in a grid environment may cover a wide range of business needs and business units, with little to no similarity in how they operate. Finally, by SMP we mean a system with eight or more processor sockets in a single domain. An SMP system is not built of commodity hardware like a cluster or grid, but is instead a purpose-designed and purpose-built system. An SMP-type platform could run any number of applications, both commercial and custom.

The discussion of which platform is best for an organization should encompass many things, including implementation costs, staff skills, maintenance costs, platform capabilities, growth expectations and usage models. While some would assume that this is primarily a technical discussion, the majority of decisions are actually financial. Specifically, this discussion should revolve mostly around how the company is going to benefit over the long term from the choices that are made. This decision is not an easy one for any company. It can involve legacy code and processes, a lack of understanding of or experience with currently unused platforms, and ultimately personal feelings about a given solution.

While it is important to seek the best technical solutions, the platform decision will ultimately be the one that best suits the company from both a business productivity and a financial perspective. This decision should be made after consulting the various levels within the organization, including users, system management personnel, line organization management and company executives. The users and system management representatives will be able to provide input on intended uses and on the technical capabilities needed to manage the new system. Line organization management will be able to provide input on how their departments will be able to use any new capabilities afforded them. The executives will add the strategic thinking to the mix.

Ultimately, this information should be presented to executive management for the decision-making process. They will ensure that the input from the various stakeholders within the company is appropriately considered. They will also properly evaluate the financial benefits that the proposed solution offers. Finally, they will ensure that the company’s strategic business plans and goals are properly evaluated. A team without corporate scope cannot adequately review questions that cover the entire organization. Further, a higher-level team can ensure that a solution will meet more stakeholder needs than if the evaluation were done at any departmental level.

Correspondingly, I believe the evaluation team should ask the following questions when evaluating future system choices:
  • What is the purchase price of this system?
  • What is the implementation cost of this system? This should include not only costs to migrate applications and data, but also facility costs like additional power and cooling.
  • Will this system integrate with existing technologies used for storage, networking, authentication, and security?
  • What is the yearly support cost of this system? This value should include the cost of staff to maintain the system, the cost to power and cool the system, the cost for regular preventive maintenance, and the cost of training for staff so they are up to date on managing and maintaining the system.
  • What is the cost of adding new users to this system? If this system is successful and additional line organizations within the company would like to utilize it, what will the costs to the organization and company be to migrate them to the new system?
  • What will utilization of this system look like? Will this system provide a higher utilization rate because of its architecture than competing solutions? Clusters and grids are primarily used in an organization to provide an environment that can adjust as the needs of various departments change.
  • What capabilities can this system provide to assist with the company’s core competencies? Are there new business tools and methodologies that can be employed because of this added capability?
  • What growth is expected on this system over the life of the system? What are the overall vendor and in-house lifecycles of the system? This is important so that the budgeting organizations can be prepared when it is time to upgrade or replace the system.
  • On what systems do the applications run most efficiently (e.g. SMP versus grid)? What type of interconnect (e.g. Ethernet, Infiniband, etc.) will provide the most efficient communications? How much memory is needed per core for the most efficient calculations? An analysis of all applications that will run on this platform should be done. This review of the applications should also include vendor communication about their recommended architectures, with a key focus on ease of management and stability.
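
The cost questions above can be folded into a rough side-by-side comparison. The following is only a minimal sketch in Python, assuming a simple model in which total cost is purchase price plus implementation cost plus yearly support cost over the in-house lifecycle, and in which expected utilization is a single fraction. The names (PlatformCosts, total_cost_of_ownership, cost_per_utilized_year) and every dollar figure are made up for illustration and do not come from any real system or vendor quote.

```python
from dataclasses import dataclass


@dataclass
class PlatformCosts:
    """Hypothetical cost inputs for one candidate platform."""
    name: str
    purchase_price: float        # one-time purchase price
    implementation_cost: float   # migration, facility power and cooling changes
    yearly_support_cost: float   # staff, power, cooling, maintenance, training
    expected_utilization: float  # fraction of capacity actually used, 0 < u <= 1


def total_cost_of_ownership(platform: PlatformCosts, years: int) -> float:
    """One-time costs plus recurring support over the system's lifecycle."""
    one_time = platform.purchase_price + platform.implementation_cost
    return one_time + platform.yearly_support_cost * years


def cost_per_utilized_year(platform: PlatformCosts, years: int) -> float:
    """Total cost divided by the years of capacity that are actually used."""
    if not 0 < platform.expected_utilization <= 1:
        raise ValueError("expected_utilization must be in the range (0, 1]")
    return total_cost_of_ownership(platform, years) / (
        years * platform.expected_utilization
    )


# Illustrative figures only; replace them with the answers gathered
# from the questions in the list above.
grid = PlatformCosts("grid", 400_000, 100_000, 150_000, 0.8)
smp = PlatformCosts("smp", 600_000, 50_000, 120_000, 0.5)

if __name__ == "__main__":
    for candidate in (grid, smp):
        print(
            candidate.name,
            total_cost_of_ownership(candidate, years=4),
            round(cost_per_utilized_year(candidate, years=4)),
        )
```

Dividing by utilization is a deliberate simplification: it makes a cheaper but idle system look as expensive as it really is. The cost of adding new users and the expected growth are left out here and would need their own terms in a fuller model.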

By asking and answering these questions, a company can get a complete accounting of which solution will provide the best long-term benefit in cost, improved utilization rates and more efficient growth. Ultimately, the decision that is best for a company is the one that makes the most financial sense.

I believe that by going with a grid-based solution, a growing and dynamic company will have a system in place that will easily change as the company’s needs and directions change. A grid-based solution can provide most companies with a platform that will provide capability for today’s needs and change to accommodate tomorrow’s. I believe that enterprise grids will be one of the strongest candidate architectures for a company with diverse business processes.

In the coming weeks I intend to evaluate the other components, including networking, operating systems and storage, and look at what questions must be asked for each of these areas.
